Abstract


This study examined the contributions of students’ noncognitive characteristics toward explaining 
performance on the ACT® test, over and above traditional predictors such as high school grade point 
average (HSGPA), coursework taken, and school characteristics. The sample consisted of 6,440 
high school seniors from 4,541 schools who took the ACT in the fall of 2012 and completed an 
online questionnaire about their high school experience, study and work habits, parental involvement, 
educational and occupational plans and goals, and college courses taken and/or credits earned in 
high school. Twelve percent of the total sample responded and met the study inclusion criteria. 

A blockwise regression model with cluster-robust standard errors was used to assess the
relationships of cognitive and noncognitive characteristics with ACT scores. The total
variance in ACT scores accounted for by available student and school characteristics ranged from 
44% (reading) to 61% (Composite). HSGPA explained the most variance (20% to 31%), high
school coursework taken explained an additional 8% (reading) to 17% (mathematics), and high 
school characteristics accounted for an additional 7% to 9% of the variance in ACT scores. After 
accounting for these traditional predictors, students' noncognitive factors explained between 4% 
and 7% of the variance in ACT scores. These noncognitive characteristics included indicators of 
needing help in improving subject-related academic skills, educational plans, parental involvement, 
perceptions of education, and taking the ACT test prior to senior year. Socioeconomic status and 
other demographic characteristics accounted for less variance in ACT scores (4% or below), after 
adjusting for other student and school characteristics. Moreover, adjusted mean score differences 
among racial/ethnic, family income, and parental education groups were substantially reduced, 
compared to unadjusted group differences. Students’ noncognitive characteristics were also highly 
related to HSGPA. Study findings suggest that noncognitive characteristics affect ACT scores 
directly as well as through their impact on HSGPA. 

In light of the growing interest in evaluating both cognitive and noncognitive measures of college
readiness, the results from this study may help to provide better context and guidance for the 
interpretation of college readiness measures. These findings also contribute to a more holistic 
understanding of college readiness. This perspective is important in order to better understand the 
multidimensional nature of college and career readiness and subsequent success. 


Introduction 


To meet their admission goals, four-year postsecondary institutions often rely on academic 
measures to help determine the likelihood that a student will be successful in college (Clinedinst, 
2015). Academic measures often include grades in college preparatory courses, strength of high 
school curriculum, standardized test scores (the ACT or SAT), and high school grade point average 
(HSGPA), because these measures have been found to be useful in identifying students who are 
ready for college and predicting students’ eventual success in college (Kobrin, Patterson, Shaw, 
Mattern, & Barbuti, 2008; Radunzel & Noble, 2012; Sawyer, 2010). 

Recently, there has been an increased interest in taking a more holistic approach to evaluating 
students’ college readiness levels to better equip students with the knowledge, skills, and support 
they need to succeed in college (Farrington, Roderick, Allensworth, Nagaoka, Keyes, Johnson, & 
Beechum, 2012; Mattern, Burrus, Camara, O’Connor, Hanson, Gambrell, Casillas, & Bobek, 2014). 
This interest stems from the growing body of research that suggests that other noncognitive 
characteristics can improve college success predictions beyond those based on academic measures 
alone.¹ For example, in a meta-analysis of factors predicting college outcomes, Robbins, Lauver,
Le, Davis, Langley, and Carlstrom (2004) argued that noncognitive characteristics can help to
account for some of the remaining variability unaccounted for by academic measures. Specifically, 
the authors found that noncognitive factors such as motivation, academic goals, and academic 
self-efficacy² were significantly related to college grades and retention, even after controlling for
socioeconomic status (SES), HSGPA, and ACT/SAT scores. Other noncognitive characteristics that 
have been found to be useful for predicting various measures of academic success in K-12 and/or
college include: academic and social integration (Milem & Berger, 1997; Tinto, 1993), parental
involvement (Flint, 1992; Henderson & Mapp, 2002), study attitudes (Zimmerman, Parks, Gray,
& Michael, 1977), personality (Ridgell & Lounsbury, 2004), student involvement (Astin, 1993),
problem-solving (Le, Casillas, Robbins, & Langley, 2005), student engagement (Lee & Shute, 2009), 
behavioral learning strategies (Lee & Shute, 2009), and conscientiousness (Poropat, 2009). 

In a recent study, Gaertner and McClarty (2015) investigated both cognitive
and noncognitive factors affecting a college readiness index derived from students' HSGPA and 
ACT/SAT scores. Using data from the National Education Longitudinal Study of 1988, these 
researchers predicted their college readiness index from six component scores identified from 
a principal components analysis. These components, derived from 140 middle school variables, 
included: academic achievement, motivation and commitment, behavior, social engagement, family 
circumstances, and school characteristics. They found that about one third of the variability in their 
college readiness index was accounted for by the motivation, social, and behavior components; 
nearly 70% was explained by all six components. 


1 The definition of noncognitive characteristics (sometimes referred to as psychosocial or non-academic characteristics) can vary 
widely, but in general, the term has become a catchall phrase used to describe any factors beyond standardized test scores, HSGPA, 
coursework taken, class rank, and student demographics (Sommerfeld, 2011). 

2 According to Robbins et al. (2004), the construct definition for academic self-efficacy is a “self-evaluation of one’s ability and/or 
chances for success in the academic environment." 
Additional prior research has shown that high school grades and coursework are related to
standardized achievement scores. For example, Noble, Davenport, Schiel, and Pommerich (1999a, 
1999b) found that HSGPA accounts for nearly 40% of the variance in ACT Composite scores. 
Specific coursework taken in high school accounted for 9% of the variability in ACT Composite 
scores above that explained by HSGPA (Noble et al., 1999a, 1999b). This, however, leaves more 
than half of the variability in ACT scores unaccounted for. 

Noble et al. (1999a, 1999b) also found that although HSGPA and ACT scores are related and have 
some noncognitive predictors in common, there are some noncognitive predictors related to HSGPA 
that are not directly related to ACT scores. Moreover, the two variables measure different aspects 
of academic achievement. ACT scores are reported on a scale that maintains the same meaning 
across years and across high schools; for this reason, these scores are not affected by differential 
grading standards. The scores reflect the level of educational achievement at a moment in time, 
often at the end of a student’s junior year or beginning of the senior year in high school. HSGPA, 
in contrast, reflects performance in courses over the duration of high school. Some research on 
HSGPA indicates that teachers explicitly consider behavior and other noncognitive characteristics 
in assigning grades (Campbell, 2011). In this way, HSGPA is affected not only by level of content
mastery, but also by the courses taken and a student's personal behaviors, such as whether the
student is prudent about taking good notes, putting forth effort and participating in class, completing 
homework assignments, and preparing well for course exams. 

In addition to student characteristics, previous research has shown academic performance to vary 
as a function of school characteristics. For example, past studies have found substantial variability 
among schools in the academic achievement levels of their students (as measured by ACT scores), 
even after accounting for differences in their students' characteristics (Sawyer, 2008). Moreover, 
relationships have been documented between academic outcomes (e.g., HSGPA, standardized 
tests, high school dropout) and school-level characteristics such as percentage of students on
free/reduced lunch (Swanson, 2004), public school status (Lubienski, Lubienski, & Crane, 2008), wealth
of the community (Kim & Sunderman, 2005), and racial/ethnic composition of the school (Cook 
& Evans, 2000). Measures of school culture and climate, such as those related to college-going 
expectations/aspirations and teacher quality, have also been suggested to be informative in studying 
student achievement (MacNeil, Prater, & Busch, 2009; Oseguera, 2013). Other studies have 
reported that schools that serve high proportions of low-income and racial/ethnic minority students 
often find it challenging to retain effective teachers due to poor work environments where effective 
teaching and learning are not supported (Johnson, Kraft, & Papay, 2012). 

Average ACT scores also substantially differ among racial/ethnic and family income groups and 
by parental education level (ACT, 2014a; ACT & Council for Opportunity in Education, 2013; ACT 
& National Council for Community and Education Partnerships, 2013). For example, for the
ACT-tested graduating class of 2014, average ACT Composite scores for Hispanic and African American
students are over 3.0 and 5.0 scale score points lower than that for White students (18.8, 17.0, and 
22.3, respectively). However, differences also exist among these demographic groups in average 
HSGPA and in the mathematics and science courses typically taken in high school (Radunzel, 2015). 

Given the desire to have all students graduate from high school prepared for college or the 
workforce, more research is needed to understand the multitude of characteristics that foster 
readiness among students so that we can better prepare them for future success. As such, 
the primary purpose of this study is to explore the effects of noncognitive factors and student 
demographic characteristics on ACT test scores above and beyond HSGPA, high school coursework
taken, and aspects of students' school environment, with an emphasis on noncognitive measures 
related to academic goals, behaviors, self-perceptions, and parental involvement. This study will 
build upon the work of Noble et al. (1999a, 1999b) and Gaertner and McClarty (2015) by using 
more proximal measures of academic preparation for a more recent cohort of high school students. 
Second, we will examine how mean differences in ACT scores among demographic groups (e.g., 
race/ethnicity, parental education, family income, gender) change after other cognitive, school-related,
and noncognitive characteristics are taken into account. Third, to better understand
differences in factors related to HSGPA and standardized test scores, we will investigate the extent 
to which noncognitive characteristics influence HSGPA, a traditional predictor of ACT scores. Fourth, 
from a methodological perspective, this study illustrates applications of statistical techniques that are 
not commonly used in educational research. 


Data 


Data Collection 

Sampling Frame 

The sampling frame consisted of the registration records of all US high school seniors from the 
2013 high school graduating class who registered for the October or December 2012 national 
test dates of the ACT. Individuals were removed from the sampling frame if they did not provide a 
valid email address. Additionally, students who had been selected for participation in other recent 
ACT projects or studies were also removed from the sampling frame. The final sampling frames for 
the October and December 2012 test administrations included 296,890 and 279,148 high school 
seniors, respectively. 

Sampling Procedure 

For each of the two test dates, 28,000 registrants were selected from the sampling frame by
simple random sampling without replacement, so that every student in the
sampling frame had an equal probability of selection. This
resulted in the selection of 56,000 registrants. Student characteristics were found to be similar 
between the random sample and the sampling frame. 
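The selection step described above can be sketched in a few lines. This is only an illustration under stated assumptions: the registrant IDs are hypothetical placeholders, the seed is arbitrary, and the report does not say what software was used to draw the sample.

```python
import random


def simple_random_sample(frame_ids, n, seed=None):
    """Draw n units without replacement; every unit is equally likely to be chosen."""
    rng = random.Random(seed)
    return rng.sample(frame_ids, n)


# Hypothetical frame of registrant IDs; only the frame size comes from the text.
october_frame = list(range(296_890))
october_sample = simple_random_sample(october_frame, 28_000, seed=2012)

# Under simple random sampling without replacement, each registrant's
# inclusion probability is n / N.
inclusion_probability = 28_000 / 296_890
```

Repeating the same call with the December frame of 279,148 registrants would give the second half of the 56,000 sampled registrants.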

Next, on the Monday after each ACT test administration (both administrations took place on a
Saturday), emails were sent to the sampled registrants inviting them to complete an online
questionnaire. Reminder emails to non-respondents were sent four and ten days after the initial
contact. A third reminder email was also sent to the non-respondents from the December sample. 

A total of 8,447 registrants responded to the online questionnaire (15%). Of these respondents,
1,222 (14%) did not take the ACT test that they registered for. Of the 7,225 registrants with ACT
test scores, 785 (11%) either responded to only the first item on the questionnaire about whether
they took the ACT in October or December 2012, or did not indicate plans of enrolling in college and 
therefore were not administered all of the remaining questionnaire questions. The final sample for 
the study included 6,440 college-bound high school seniors from the 2013 ACT-tested high school 
graduating class (12% of the initial sample). 
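The counts and percentages in this paragraph can be checked with simple arithmetic. The sketch below uses only the figures reported in the text; the variable names and the half-up rounding helper are illustrative choices, not part of the study.

```python
import math

invited = 56_000          # sampled registrants across both test dates
responded = 8_447         # responded to the online questionnaire
did_not_test = 1_222      # respondents without ACT scores
excluded = 785            # answered only the first item, or not college-bound

tested = responded - did_not_test
final_sample = tested - excluded


def pct(part, whole):
    """Percentage rounded half up to a whole number."""
    return math.floor(100 * part / whole + 0.5)


response_rate = pct(responded, invited)
no_score_rate = pct(did_not_test, responded)
excluded_rate = pct(excluded, tested)
final_rate = pct(final_sample, invited)
```

The derived values agree with the reported 7,225 tested respondents, the final sample of 6,440, and the percentages of 15%, 14%, 11%, and 12%.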
Instruments

Data for this study were taken from two sources: the ACT, and an online questionnaire developed to 
collect information about students' academic engagement, parental involvement in college planning, 
and students’ own college intentions, plans, expectations, commitments, and financial concerns. 

The ACT 

The ACT is a curriculum-based educational achievement test taken by nearly two million students 
each year. It consists of four academic tests in English, mathematics, reading, and science, and an 
optional writing test. The tests are designed to measure skills that are acquired in high school and 
that are important for success in the first year of college (ACT, 2013).³ The ACT Composite score is
the arithmetic average of the scores for the four academic tests (English, mathematics, reading, and 
science). Scores are reported on a scale of 1 to 36. The ACT English, mathematics, reading, science, 
and Composite scores were used as the dependent variables (outcome measures) for the study. A 
brief description of the four academic tests is provided below. 

■ The English test is a 75-question, 45-minute test, covering usage/mechanics (such as 
punctuation, grammar and usage, and sentence structure) and rhetorical skills (such as strategy, 
organization, and style). 

■ The mathematics test is a 60-question, 60-minute test designed to measure the mathematical 
skills students have typically acquired in courses taken by the end of 11th grade. It covers six
content areas: Pre-Algebra, Elementary Algebra, Intermediate Algebra, Coordinate Geometry, 
Plane Geometry, and Trigonometry. 

■ The reading test is a 40-question, 35-minute test that measures reading comprehension. The 
reading test is based on four types of reading selections: social studies, natural sciences, literary 
narrative or prose fiction, and humanities. 

■ The science test is a 40-question, 35-minute test that measures the skills required in the natural 
sciences: interpretation, analysis, evaluation, reasoning, and problem solving. 
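The Composite calculation described before the test descriptions can be sketched as follows. The text states only that the Composite is the arithmetic average of the four test scores on a 1 to 36 scale; rounding the average half up to a whole number is an assumption made here about how the score is reported, and the function name is hypothetical.

```python
import math


def act_composite(english, mathematics, reading, science):
    """Average the four subject scores (each on the 1-36 scale).

    Assumption: the average is rounded half up to a whole number.
    The text itself only says 'arithmetic average'.
    """
    scores = (english, mathematics, reading, science)
    if not all(1 <= s <= 36 for s in scores):
        raise ValueError("each ACT test score must be between 1 and 36")
    mean = sum(scores) / 4
    return math.floor(mean + 0.5)


example = act_composite(20, 22, 24, 23)  # mean is 22.25
```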

At the time students register to take the ACT, they are asked to complete a Course Grade 
Information Section (CGIS) and a Student Profile Section (SPS). The CGIS provides information 
about students' coursework and grades in 30 specific high school courses. Students are asked to 
indicate whether they have taken or are currently taking a particular course, or whether they plan to 
take it prior to graduating from high school. For courses already completed, students are also asked 
to indicate the letter grade they received (A-F). Prior studies have shown that students report high 
school coursework and grades with a high degree of accuracy relative to information provided in 
their transcripts (Sanchez & Buddin, 2015; Shaw & Mattern, 2009). From the information provided 
on the CGIS, HSGPA was calculated from 23 specific courses taken in English, mathematics, social 
studies, and science. Subject-specific GPAs were also calculated. 

3 These first-year college expectations are summarized in the ACT College and Career Readiness Standards™ (available at
www.act.org/standard).

Course information from the CGIS was also used to examine specific course-taking patterns in
mathematics and science that have been outlined in previous studies (ACT, 2004; Noble &
Schnelker, 2007). Course patterns were constructed so that the incremental benefit of specific
courses could be determined. In mathematics, the course sequence patterns include:

■ Less than Algebra I, Algebra II, and Geometry (labeled as Less AAG) 

■ Algebra I, Algebra II, and Geometry (AAG) 

■ Algebra I, Algebra II, Geometry, and other advanced math course (AAGO) 

■ Algebra I, Algebra II, Geometry, and Trigonometry (AAGT) 

■ Algebra I, Algebra II, Geometry, Trigonometry, and other advanced math course (AAGOT) 

■ Algebra I, Algebra II, Geometry, Trigonometry, and Calculus (AAGTC) 

■ Algebra I, Algebra II, Geometry, Trigonometry, Calculus, and other advanced course (AAGOTC) 

■ Other sequence of 3 or more years of mathematics courses (Other-High)

■ Other sequence of fewer than 3 years of mathematics courses (Other-Low) 
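One way to picture the mathematics patterns listed above is as a lookup on the courses a student reports. The sketch below is hypothetical: the course names, the function, and especially the rules for "Less AAG" (taken here to mean a partial set of Algebra I, Algebra II, and Geometry with no other mathematics) and for the two "Other" categories are assumptions, since the report does not spell out its coding rules.

```python
CORE = {"Algebra I", "Algebra II", "Geometry"}

# Named patterns keyed by (Trigonometry, Calculus, other advanced math),
# given that all three core courses were taken.
NAMED_PATTERNS = {
    (False, False, False): "AAG",
    (False, False, True): "AAGO",
    (True, False, False): "AAGT",
    (True, False, True): "AAGOT",
    (True, True, False): "AAGTC",
    (True, True, True): "AAGOTC",
}


def math_pattern(courses, years_of_math):
    """Assign one of the nine mathematics course patterns (assumed rules)."""
    taken = set(courses)
    key = (
        "Trigonometry" in taken,
        "Calculus" in taken,
        "Other Advanced Math" in taken,
    )
    if CORE <= taken:
        label = NAMED_PATTERNS.get(key)
        if label is not None:
            return label
    elif taken <= CORE:
        # Assumption: only part of the core sequence and nothing else.
        return "Less AAG"
    # Any remaining combination falls into an "Other" category by years taken.
    return "Other-High" if years_of_math >= 3 else "Other-Low"
```

Because the patterns are nested, comparing adjacent categories (for example, AAG versus AAGT) isolates the incremental contribution of a single course such as Trigonometry.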

Sequences in science courses are: 

■ Less than Biology or other sequence of less than 3 years 

■ Biology 

■ Biology and Chemistry 

■ Biology, Chemistry, and Physics 

■ Other sequence of 3 years 

Courses in social studies were also considered but did not follow a clear-cut sequence as in science 
and mathematics. Consequently, Government, Economics, Geography, Psychology, and History (other 
than American or World) were considered individually. English course information was also obtained 
from the CGIS; however, nearly all students reported taking English 9, English 10, English 11, and 
English 12. As a result of the low variability, high school English courses were not included in the 
analysis. 

From the CGIS course information, indicators were developed for whether students had taken a 
core curriculum in English, mathematics, science, and social studies as well as for all subjects. A core 
curriculum was defined as four years of English and three years each of mathematics, science, and 
social studies. The number of years students studied a foreign language was also calculated from 
the CGIS. 
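The core curriculum definition above translates directly into a set of indicators. This is a minimal sketch, assuming years of study per subject have already been tallied from the CGIS; the dictionary keys and function name are hypothetical.

```python
# Years required in each subject under the core curriculum definition in the text.
CORE_YEARS = {"English": 4, "mathematics": 3, "science": 3, "social studies": 3}


def core_indicators(years_by_subject):
    """Return a core-curriculum flag per subject plus an overall flag."""
    flags = {
        subject: years_by_subject.get(subject, 0) >= required
        for subject, required in CORE_YEARS.items()
    }
    flags["all subjects"] = all(flags.values())
    return flags


# Hypothetical student: meets the core in every subject except science.
student = {"English": 4, "mathematics": 4, "science": 2, "social studies": 3}
student_flags = core_indicators(student)
```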

The SPS includes noncognitive information such as students' expected educational attainment (a 
measure of academic goals; Allen, 1999; Cabrera & La Nasa, 2001; Eppler & Harju, 1997; Robbins
et al., 2004) and whether they indicate needing help in improving their skills in a variety of subject 
areas. The SPS also collects demographic and background information as well as information about 
students’ interests, accomplishments, career plans, and perceived need for help with their study skills 
and their educational and occupational plans. 
Online Questionnaire

The questionnaire consisted of 48 items asking students to self-report information on areas such 
as their high school experience, study and work habits, parental involvement, educational and 
occupational plans and goals, perceptions of their future college experience, and college courses 
taken (dual-credit coursework) and/or credits earned in high school. All item response options were 
discrete and consisted of five- or six-point Likert-type items evaluating respondents’ general level of
agreement or frequency of partaking in a particular behavior or action. 

Variables from the online questionnaire featured noncognitive characteristics that have been found 
to be predictive of academic performance in other contexts. Students were asked about their 
effort and preparedness in school (academic commitment; Gaertner & McClarty, 2015; Robbins,
Allen, Casillas, Peterson, & Le, 2006), expected educational outcomes and perceptions of college
(commitment to college, academic goals; Allen, 1999; Ramist, 1981; Robbins et al., 2004), study and
work habits (Cooper, Robinson, & Patall, 2006; Le et al., 2005; Singh, 1998), parental involvement
(Comer, 2005; Flint, 1992; Gaertner & McClarty, 2015; Henderson & Mapp, 2002; Lee & Bowen, 
2006), and whether they were challenged in high school. 

High School Characteristics 

Students indicated their high school attended at the time they registered for the ACT. School 
characteristics were obtained from the National Center for Education Statistics' (NCES) Common
Core of Data (CCD) for years 2010 to 2012 and Market Data Retrieval (MDR) files. Variables from 
these sources included the school type (public vs. non-public), the percentage of students eligible 
for free/reduced lunch (FRL), and percentage of minority students (the latter two available only 
for public schools). The median household income for the zip code associated with school location 
was obtained from readily available US 2000 Census data. In addition, the following measures 
were calculated for each high school based on the 2011, 2012, and 2013 ACT-tested high school
graduating classes: mean ACT scores, the percentage of students taking mathematics coursework 
beyond Algebra II, the percentage of students intending to earn a post-baccalaureate degree, the 
percentage of students taking the ACT test, and the percentage of students immediately enrolling in 
college the fall following high school graduation (enrollment data obtained from the National Student 
Clearinghouse). For schools with fewer than 25 ACT-tested students, district-level means were used 
instead. The high school characteristics calculated for the 2011 through 2013 ACT-tested graduating 
classes were used as proxy measures of a college-going culture that has been found to promote 
students’ college aspirations (Corwin & Tierney, 2007).⁴


4 School demographic characteristics were included in this study as possible proxies for information related to school resources,
school environment, and quality of education. The school-level characteristics that were calculated for the ACT-tested population were 
the only variables readily available to try to capture other aspects of the school climate/culture. See the Introduction and Discussion 
sections for a more detailed discussion on the relevance of these school-level characteristics. 
Method


Weighting 

The sample of respondents differed from the population of interest in terms of key characteristics 
(see Table 1). For example, males made up a smaller share of the sample than of all
seniors who tested in the 2012-13 academic year (34% vs. 44%). Therefore, weights were applied to the sample to
account for the overrepresentation of certain student groups. Students’ responses were weighted 
on a combination of four different variables: gender, HSGPA, race/ethnicity, and ACT Composite 
score.⁵ HSGPA was dichotomized into “high” (3.50 or above) and “low” (below 3.50); ACT Composite
score was categorized into five groups: 1 to 16, 17 to 19, 20 to 22, 23 to 25, and 26 or higher.⁶
The weights ranged from 0.51 for minority females with a high HSGPA and ACT Composite score 
of 26 or above to a weight of 2.24 for other/unknown ethnicity males with a low HSGPA and ACT 
Composite score of 16 or below. These weights were used throughout the rest of the analyses 
presented in this report. 
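A minimal sketch of how such weights can be built is shown below. It assumes simple cell weighting (population share divided by sample share) followed by normalization so that the weights sum to the sample size; the report does not state the exact weighting procedure. Because the joint cell counts for the four weighting variables are not given in the text, the example uses only the gender counts from Table 1, so the resulting weights illustrate the idea and are not the study's weights.

```python
def cell_weights(sample_counts, population_counts):
    """Weight each cell by population share / sample share, then normalize
    so the weights sum to the sample size across all sampled students."""
    n = sum(sample_counts.values())
    big_n = sum(population_counts.values())
    raw = {
        cell: (population_counts[cell] / big_n) / (sample_counts[cell] / n)
        for cell in sample_counts
    }
    weighted_total = sum(raw[cell] * sample_counts[cell] for cell in sample_counts)
    scale = n / weighted_total
    return {cell: weight * scale for cell, weight in raw.items()}


# Gender counts from Table 1 (illustration only; the study crossed four variables).
sample = {"Male": 2_206, "Female": 4_234}
population = {"Male": 429_101, "Female": 546_601}
weights = cell_weights(sample, population)
```

Underrepresented cells (here, males) receive weights above 1 and overrepresented cells receive weights below 1, which matches the direction of the weight range reported above.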

Table 1. Unweighted Descriptive Statistics of Study Sample Compared to 2012-13
ACT-Tested Seniors

                                             Sample          2012-13 HS Seniors
                                          (n = 6,440)          (n = 975,702)
Student Characteristic                      n       %            n         %

Race/Ethnicity        White               3,505    54%        537,967     55%
                      African American      917    14%        151,325     16%
                      Hispanic            1,023    16%        147,918     15%
                      Asian                 432     7%         45,021      5%
                      Other                 563     9%         93,471     10%

Gender                Male                2,206    34%        429,101     44%
                      Female              4,234    66%        546,601     56%

ACT Composite Score   1-16                  970    15%        197,455     20%
                      17-19               1,204    19%        198,374     20%
                      20-22               1,326    21%        208,465     21%
                      23-25               1,251    19%        172,870     18%
                      26-36               1,689    26%        198,538     20%

HSGPA                 < 3.50              3,072    48%        472,832     56%
                      ≥ 3.50              3,368    52%        370,978     44%

Note: Percentages may not sum to 100% due to rounding. Percentages for HSGPA for 2012-13 ACT-tested seniors are conditional 
on a response being given; the percentage of missing responses is 14%. For the sample and the population, the Other racial/ethnic 
category included 3% and 5%, respectively, of students who did not provide their race/ethnicity. 


5 Weights were normalized so that the sum of the weights would equal the number of students included in the sample. 

6 For the weight calculations, race/ethnicity was categorized into the following three categories: White or Asian, minority (African 
American, Hispanic, American Indian, and Pacific Islander), and Other/ Unknown. In other analyses, race/ethnicity was categorized into 
the following five categories: White, Asian, African American, Hispanic, and Other/Unknown. 
Missing Data

Some students did not respond to all of the SPS, CGIS, and/or the online questionnaire items.
In general, the percentage of missing values per item was low, but it did vary across items. One
of the higher missing rates was for family income at 19%. Apart from two other sets of items to 
be discussed below, the remaining items used in this study were missing for at most 10% of the 
students, with an overwhelming majority of items missing for at most 5% of the students. A few of 
the items that had nearly a 10% missing rate included: parental education level (10%) and HSGPA 
(8%). Multiple imputation (SAS PROC MI) was used to estimate missing values for these items. Five
data sets were imputed. Final models were estimated for all five imputed data sets; no differences 
of practical significance in the regression coefficients and significance levels were found across the 
data sets. Consequently, the results reported here are based only on the initial imputed data set.⁷
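For context, the standard combining rules for multiply imputed data (Rubin's rules) are given below. The report does not present these formulas; the symbols are notation introduced here, with m imputed data sets (m = 5 in this study), an estimate and its squared standard error from the j-th data set, and the pooled quantities built from them.

```latex
\begin{align*}
\bar{Q} &= \frac{1}{m}\sum_{j=1}^{m}\hat{Q}_j
  && \text{(pooled estimate)} \\
\bar{U} &= \frac{1}{m}\sum_{j=1}^{m} U_j
  && \text{(within-imputation variance)} \\
B &= \frac{1}{m-1}\sum_{j=1}^{m}\bigl(\hat{Q}_j-\bar{Q}\bigr)^{2}
  && \text{(between-imputation variance)} \\
T &= \bar{U} + \Bigl(1+\frac{1}{m}\Bigr)B
  && \text{(total variance)}
\end{align*}
```

Analyzing a single imputed data set amounts to using one estimate and its own variance and omitting the between-imputation term, so the approach taken here is reasonable only to the extent that B is small relative to the within-imputation variance.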

There were two sets of questions of interest asking non-sensitive information that had much higher 
levels of missingness (over 50%). The first set asked students to indicate whether they had taken 
any advanced, honors, or accelerated courses in five different subject areas. The second set asked 
students to indicate whether they needed help improving their skills in five different areas. Missing 
values were not imputed for these items. Instead, given that more than 90% of students with a 
non-missing response indicated that they took the course or needed help, omitted responses were 
considered as a “have not taken the course” or “no help needed” response. 

Principal Components Analysis of the Online Questionnaire 

Because many items on the online questionnaire addressed somewhat related areas, a principal 
components analysis (PCA)⁸ was used as a data reduction technique to produce a smaller, more
manageable set of components that would also serve to reduce collinearity concerns. Most, but not 
all, of the items from the online questionnaire were included in this analysis; individual items that 
were of particular interest as potential predictors of ACT test scores based on prior research studies 
were excluded from the PCA. All items on the online questionnaire were discrete, with most being 
Likert-type items (which often violate assumptions of traditional PCA); therefore, the PCA was run 
on a polychoric correlation matrix that assumed a latent normal distribution as recommended by 
Kolenikov and Angeles (2009). To determine the number of components to retain, Horn's Parallel 
Analysis (Horn, 1965) was used.⁹ Using an orthogonal varimax rotation,¹⁰ nine components were
extracted and accounted for 60% of the variance of the original set of variables. 
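The retention rule used in Horn's Parallel Analysis can be sketched as follows. This is an illustration under stated assumptions, not the study's procedure: it uses Pearson correlations on simulated continuous data rather than the polychoric correlation matrix used in the report, it omits the varimax rotation, and the function name, simulated loadings, and number of simulations are hypothetical choices.

```python
import numpy as np


def parallel_analysis(data, n_sims=100, seed=0):
    """Count leading components whose eigenvalues exceed the mean eigenvalues
    obtained from independent random data of the same shape."""
    rng = np.random.default_rng(seed)
    n, p = data.shape

    def sorted_eigenvalues(x):
        corr = np.corrcoef(x, rowvar=False)
        return np.sort(np.linalg.eigvalsh(corr))[::-1]  # descending order

    observed = sorted_eigenvalues(data)
    simulated = np.empty((n_sims, p))
    for i in range(n_sims):
        simulated[i] = sorted_eigenvalues(rng.standard_normal((n, p)))
    threshold = simulated.mean(axis=0)

    keep = observed > threshold
    # Retain components up to the first one that fails the comparison.
    n_retain = p if keep.all() else int(np.argmin(keep))
    return n_retain, observed, threshold


# Simulated example: six variables driven by two independent latent factors.
rng = np.random.default_rng(1)
factors = rng.standard_normal((500, 2))
loadings = np.array([[0.8, 0.0]] * 3 + [[0.0, 0.8]] * 3)
data = factors @ loadings.T + 0.6 * rng.standard_normal((500, 6))
n_retain, observed, threshold = parallel_analysis(data)
```

Comparing against simulated eigenvalues rather than a fixed cutoff of 1.0 adjusts the retention threshold for the sample size and number of variables at hand.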

7 Although it is typically important to take the multiple imputations into account so that standard errors are not biased towards zero
(Little & Rubin, 1989), given the categorical nature of the variables used in this study and the relatively low missing rates, analyzing
only one of the imputed datasets should not lead to any downward bias in standard error estimates (Schafer, 1999). In general,
categorical variables experience less random variation between imputations because they can only take on discrete values. Therefore,
the between-imputation variance (which accounts for random variation between imputations) will be rather small, especially given the
relatively low missing rates for most study variables.

8 Although an exploratory factor analysis (EFA) may also have been a method for this purpose, EFA is more useful for discerning latent
constructs underlying a set of variables. With the online questionnaire, dimension reduction was the more salient intention compared to
uncovering the underlying latent structure, thus PCA was the selected technique.

9 Horn's Parallel Analysis (HPA) is a resampling procedure that generates multiple artificial datasets of the same sample size and number
of variables as the PCA of interest, but all variables are generated such that they are completely independent of one another. A PCA is
run on each generated dataset and the eigenvalues are saved. The mean eigenvalue for each component is then calculated across all
replicated data sets. If the eigenvalue from the analysis exceeds the value from HPA, then, based on HPA, the component should be
retained. This method is related to Kaiser's Rule, which retains components with eigenvalues greater than 1.0.

10 An oblique rotation was initially considered, but only two component correlations had magnitudes greater than 0.20.

Of the nine extracted components, only two were deemed to be relevant as possible predictors for
the outcomes of interest in this study. The first consisted of items related to how engaged students
were with their academics in high school such as being prepared for assessments and turning in
assignments on time. This component, labeled as the academic commitment component, had a 
Cronbach’s α of 0.73. The second component was related to students’ perceptions of education,
such as college being worth the cost, and completion of a college degree being a priority. The
perceptions of education component had a Cronbach’s α of 0.44. Specific items that loaded highly
on each of these components can be found in Table A-1 of Appendix A. 
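The reliability coefficient reported for each component, Cronbach's α, can be computed from item-level responses as sketched below. The responses in the example are made up for illustration; the study's item data are not reproduced here, and the function name is hypothetical.

```python
from statistics import variance


def cronbach_alpha(items):
    """Cronbach's alpha for k items.

    items: list of k lists, each holding one item's scores for the same
    respondents in the same order.
    alpha = k / (k - 1) * (1 - sum of item variances / variance of total score)
    """
    k = len(items)
    totals = [sum(scores) for scores in zip(*items)]
    item_variance_sum = sum(variance(scores) for scores in items)
    return k / (k - 1) * (1 - item_variance_sum / variance(totals))


# Made-up responses from five students to three Likert-type items.
responses = [
    [4, 5, 3, 5, 2],
    [4, 4, 3, 5, 1],
    [5, 4, 2, 4, 2],
]
alpha = cronbach_alpha(responses)
```

Higher values indicate that the items vary together more consistently, which is why the 0.73 for academic commitment signals a more internally consistent scale than the 0.44 for perceptions of education.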

For two other PCA components, the individual items were considered to be of greater interest and 
more interpretable than the overall component on which they loaded. For example, one component 
consisted of items pertaining to indicators of whether students had taken dual-credit coursework in 
high school in five different subject areas, as well as an overall item concerning total college credits 
earned while in high school. We were more interested in relating the subject-specific dual-credit 
course information to performance on the corresponding ACT subject test than using an overall 
aggregate measure of any dual-credit coursework. The other component consisted of three 
parental involvement items. Two items were closely related and pertained to parental involvement 
in post-high school plans. The third item addressed a different aspect of parental involvement by 
asking students how often their parents check that they have completed their schoolwork. We were 
interested in evaluating the effects of these two different aspects of parental involvement separately. 
Therefore, individual items from these latter two PCA components were evaluated as possible 
predictors of performance on the ACT tests. 

The other five PCA components were deemed to be more relevant for predicting college 
outcomes (e.g., who is likely to enroll in college, who is likely to persist in college) than for 
predicting performance on the ACT. These components were related to students’ (1) uncertainty 
in setting career and educational goals (e.g., it is difficult deciding what occupation fits me best), 
(2) participation in college planning activities (e.g., I have taken steps to learn about scholarship 
opportunities), (3) college financial concerns (e.g., I am concerned about how I am going to pay for 
my education after high school), (4) likely college behaviors (e.g., I will likely take at least one term 
off after enrolling but before graduating), and (5) exploration of college options (e.g., I have attended a 
college fair). 

Clustered Nature of the Data 

Although students were selected through a simple random sample, they were naturally clustered 
within both schools and states. Clustered data often necessitate methods that account for 
clustering, but it is important first to assess whether the clustering is meaningful. The primary measure 
used to address this concern is the intraclass correlation (ICC). While no strict cutoffs exist for 
determining what values correspond to being meaningfully clustered, ICC values above 0.05 or 0.10 
typically require methods to account for clustering to avoid downwardly biased standard errors and, 
consequently, inflated type-1 error rates (Hox, 1998). 

For this study, ICCs ranged from 0.20 (mathematics and reading) to 0.23 (Composite and science). 11 
These are fairly large ICC values and are in line with what Hedges and Hedberg (2007) found in 
a review of ICC values for math and reading scores in high school students (0.17 for 12th grade 
reading, 0.24 for 12th grade math). The ICC estimates suggest that modeling the data with a 
single-level model that does not account for clustering might lead to downwardly biased standard errors 
and inflated type-1 error rates. States were also considered as a third level of clustering; however, 
ICCs at the state-level were less than 0.05 for all ACT tests and, consequently, only clustering within 
schools was considered in subsequent analyses. 

11 It is also important to note that the ICCs may be slightly inflated as Clarke (2008) and McNeish (2014) found that sparse data 
structures (often defined by an average cluster size below 5) tend to lead to overestimation of the variance of the intercept random effect. 
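
To make the ICC concrete, the sketch below computes it with the one-way ANOVA estimator for unbalanced clusters. This is an assumption made for illustration: the report does not state which estimator was used, and footnote 11 suggests a random-intercept model. The function name `icc_anova` and the toy scores are hypothetical.

```python
def icc_anova(groups):
    """groups: list of lists of outcomes, one inner list per cluster (e.g., school)."""
    k = len(groups)
    n_total = sum(len(g) for g in groups)
    grand_mean = sum(sum(g) for g in groups) / n_total
    # Between-cluster and within-cluster sums of squares.
    ss_between = sum(len(g) * (sum(g) / len(g) - grand_mean) ** 2 for g in groups)
    ss_within = sum((y - sum(g) / len(g)) ** 2 for g in groups for y in g)
    ms_between = ss_between / (k - 1)
    ms_within = ss_within / (n_total - k)
    # Adjusted average cluster size, which handles unequal cluster sizes.
    n0 = (n_total - sum(len(g) ** 2 for g in groups) / n_total) / (k - 1)
    return (ms_between - ms_within) / (ms_between + (n0 - 1) * ms_within)

# Hypothetical scores for three clusters of two students each.
toy_icc = icc_anova([[1, 2], [3, 4], [5, 6]])
```

In this toy case most of the variation lies between clusters, so the ICC is close to 1; values near 0 would indicate that cluster membership carries little information about the outcome.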

It is worth noting that there were three groups of students who were not traditionally clustered 
within schools: home-schooled students (n = 86), students working on completing a GED (n = 8), 
and students who could not locate or provide a high school code (n = 5). In estimating the ICC, 
these students were accounted for in different ways, but ICC values were very similar between 
these methods. 12 Therefore, in all subsequent analyses, each of these three groups of students was 
treated as a cluster (i.e., all home-schooled students comprised a single cluster). 13 

Modeling Technique 

In this study, students were clustered within schools, but very sparsely as there were 4,541 different 
schools and only 6,440 students. Most schools were represented by only a single student (weighted 
mean = 1.4 students per school, weighted median = 1.1). Previous studies used fixed effects 
models where dummy-coded indicators for different schools were entered into the model (Noble et 
al., 1999a, 1999b) to account for the variability attributable to high school attended. For the current 
study, however, a more parsimonious approach was desired. 

Furthermore, model-based methods typically utilized for clustered data in education (e.g., 
hierarchical linear models) may be of questionable utility when the clusters exhibit such a high level 
of sparseness (Clarke, 2008; Clarke & Wheaton, 2007; McNeish, 2014). Specifically, problems 
arise when estimating random effects based on highly sparse clusters. As a result, the variance 
components have been found to be overestimated with sparse clusters, which can propagate 
throughout the model and potentially lead to biased standard errors and possibly biased fixed-effect 
point estimates as well (Primo, Jacobsmeier, & Milyo, 2007). However, McNeish (2014) showed 
that design-based methods (DBMs) that account for clustering were far less affected by sparsely 
clustered data and performed well in simulation conditions with an ICC of 0.20, 200 clusters, and 
an average cluster size of 1.5. This design closely mirrors the attributes of the current dataset. 
Therefore, DBMs were used to account for clustering in this study. 

Specifically, a blockwise regression model with cluster-robust standard errors (CR-SEs; Huber, 
1967; White, 1980, 1984) was used to assess the relation of noncognitive 
characteristics with ACT scores over and above HSGPA, high school coursework, and characteristics 
of the high school attended. A blockwise approach means that sets of predictor variables were 
entered into the model together. Changes in R² (the percentage of variation explained by the block 
of predictors) were germane to the research question. Fortunately, parametric DBMs (e.g., 
cluster-robust standard errors, also called sandwich or empirical estimators) are able to preserve R² such 
that it is asymptotically equivalent to what would be obtained with ordinary least squares (OLS) 
regression (Hayes & Cai, 2007). Specifically, because cluster-robust standard errors account for 
the clustering through a statistical correction rather than by including random effects, as is done in 
multilevel models, the reduction in the residual variance of a target model relative to 


12 The three approaches included: (1) treating each group as a cluster, (2) excluding these students from the analyses, and (3) coding 
each student as a cluster of size one. 

13 Similar regression coefficients and p-values from the final models were obtained when these students were excluded from the 
analyses. 
an intercept-only model can be shown to be algebraically equivalent to an OLS R² calculation. This 
cannot be said of multilevel models because the expected mean square is formulated differently 
when random effects are present. 14 
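
The point that cluster-robust standard errors leave the point estimates, and therefore R², untouched can be illustrated with a short sketch. It uses the basic (CR0) sandwich form without small-sample corrections, which is an assumption; the report does not specify the exact variant or SAS procedure used. The function name `ols_cluster_robust` and the simulated data are hypothetical.

```python
import numpy as np

def ols_cluster_robust(X, y, cluster_ids):
    """OLS coefficients with CR0 sandwich standard errors and the usual OLS R-squared."""
    X = np.asarray(X, dtype=float)
    y = np.asarray(y, dtype=float)
    cluster_ids = np.asarray(cluster_ids)
    bread = np.linalg.inv(X.T @ X)
    beta = bread @ X.T @ y              # identical to ordinary least squares
    resid = y - X @ beta
    meat = np.zeros((X.shape[1], X.shape[1]))
    for c in np.unique(cluster_ids):
        idx = cluster_ids == c
        score = X[idx].T @ resid[idx]   # sum of x_i * e_i within one cluster
        meat += np.outer(score, score)
    cov = bread @ meat @ bread          # sandwich estimator
    r2 = 1.0 - resid @ resid / np.sum((y - y.mean()) ** 2)
    return beta, np.sqrt(np.diag(cov)), r2

# Hypothetical data: 200 clusters of 3 observations sharing a cluster effect.
rng = np.random.default_rng(0)
ids = np.repeat(np.arange(200), 3)
u = rng.standard_normal(200)[ids]
x = rng.standard_normal(600)
y = 1.0 + 2.0 * x + u + rng.standard_normal(600)
X = np.column_stack([np.ones(600), x])

beta, se, r2 = ols_cluster_robust(X, y, ids)
```

Only the covariance matrix changes relative to OLS; the residuals that enter R² are the ordinary OLS residuals, which is why block-to-block changes in R² remain interpretable.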

ACT Score Models 

Five separate regression models were developed, one for each of the five ACT scores (Composite, 
English, mathematics, reading, and science). Relevant candidate predictor variables obtained from 
the previously described instruments or PCA results were placed into seven different blocks based 
on the nature of the variables. The candidate predictors and their corresponding block and block 
category assignments are shown in Table 2. To simplify presentation of some of the results, blocks 
were further categorized into the following four “block categories”: high school academic factors, 
school characteristics, noncognitive characteristics, and demographics. By design, blocks containing 
high school student academic factors (grades, coursework) entered the model first so that the 
incremental effect of noncognitive characteristics could be evaluated. 
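
The incremental R² attributed to each block can be sketched as the change in OLS R² as blocks are added in order. The code below is an illustrative Python sketch with simulated data; the helper names (`r_squared`, `blockwise_delta_r2`) and the three toy blocks are assumptions and do not reproduce the study's variables.

```python
import numpy as np

def r_squared(X, y):
    """OLS R-squared for predictors X (2-D array) with an intercept added."""
    X1 = np.column_stack([np.ones(len(y)), X])
    beta, *_ = np.linalg.lstsq(X1, y, rcond=None)
    resid = y - X1 @ beta
    return 1.0 - resid @ resid / np.sum((y - y.mean()) ** 2)

def blockwise_delta_r2(blocks, y):
    """blocks: list of (name, 2-D array) in entry order. Returns [(name, delta R-squared)]."""
    out, prev, cols = [], 0.0, []
    for name, block in blocks:
        cols.append(block)
        r2 = r_squared(np.column_stack(cols), y)
        out.append((name, r2 - prev))   # variance explained beyond earlier blocks
        prev = r2
    return out

# Hypothetical predictors and outcome.
rng = np.random.default_rng(0)
n = 500
grades = rng.standard_normal((n, 1))
coursework = 0.5 * grades + rng.standard_normal((n, 2))
noncog = rng.standard_normal((n, 1))
score = 2.0 * grades[:, 0] + coursework[:, 0] + 0.5 * noncog[:, 0] + rng.standard_normal(n)

steps = blockwise_delta_r2(
    [("grades", grades), ("coursework", coursework), ("noncognitive", noncog)], score
)
```

Because later blocks are credited only with variance not already explained, the order of entry matters: a block correlated with earlier blocks will show a smaller increment than it would if entered first.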

14 A more detailed discussion of both model-based and design-based methods to account for clustering is provided in Appendix B. 

Table 2: Variables Included in Each Block of Predictors 

Block Category: High School Academic Factors 

  Block 1: High School Grades Earned 
    23-Course HSGPA or Subject-Specific HSGPA (Based on higher R²) 1 

  Block 2: Course Information 
    Core Curriculum Indicators (English, Math, Social Science, Natural Science, Overall) 
    Math Course Sequence Taken (AAG is referent) 
    Science Course Sequence Taken (Biology is referent) 
    Social Studies Courses Taken (Government, Economics, Geography, Psychology, Other History) 
    Years of Foreign Language 

  Block 3: Advanced Coursework Indicators 
    Accelerated, Advanced, Honors, and/or Dual-Credit English Course Taken 
    Accelerated, Advanced, Honors, and/or Dual-Credit Math Course Taken 
    Accelerated, Advanced, Honors, and/or Dual-Credit Natural Science Course Taken 
    Accelerated, Advanced, Honors, and/or Dual-Credit Social Science Course Taken 
    Number of College Credits Earned in High School (0; 1-6; 7 or more) 

Block Category: High School Characteristics 

  Block 4: High School Characteristics 
    Median Income for High School Zip Code 2 (< $35,421; $35,421-$47,852; > $47,852) 3 
    Percent Anticipating Graduate Degree 4 (Mean = 42%) 1 
    Percent Enrolling in College 4 (Mean = 71%) 1 
    Percent Free/Reduced Lunch (FRL) 5 (< 24%; 25%-50%; > 51%) 3 
    Percent of High School Taking the ACT test 4 (Mean = 62%) 1 
    Percent Minority 5 (< 8%; 9%-36%; > 37%) 3 
    Non-Public School Indicator 

Block Category: Noncognitive Characteristics 

  Block 5: Noncognitive Characteristics 
    Academic Commitment Component (from PCA) [and quadratic term] 
    Expected Education Level (Below Bachelor's, Bachelor's, Beyond Bachelor's) 
    Hours spent studying per week outside of class (0-5; 6-10; 11 or more) 
    Hours spent working for pay per week (0; 1-10; 11 or more) 
    My parent/guardian is involved in my post-HS plans 6 
      (Strongly Agree, Moderately Agree, Slightly Agree, Slightly Disagree, 
      Moderately Disagree, Strongly Disagree) 
    My parent/guardian checks to make sure I have completed assignments 6 
      (Always, Usually, About Half the Time, Occasionally, Never) 
    My school challenges me to perform to the best of my ability 6 
      (Always, Usually, About Half the Time, Occasionally, Never) 
    Need Help Math Skills Indicator 
    Need Help Reading Skills Indicator 
    Need Help Educational/Occupational Plans Indicator 
    Need Help Study Skills Indicator 
    Need Help Writing Skills Indicator 
    Perception of Education Component (from PCA) 
    Plan to discuss postsecondary plans with counselor 
      (Have Done So, Have Not but Plan to, Have Not and Will Not) 
    Plan to find out what education is necessary for desired job 
      (Have Done So, Have Not but Plan to, Have Not and Will Not) 
    Self-Reported College Prep High School Curriculum Indicator 
    Tested with the ACT in Junior Year Indicator 

Block Category: Demographics 

  Block 6: SES-Related Demographics 
    English Spoken at Home Indicator 
    Family Pre-Tax Annual Income (< $36,000; $36,000-$80,000; > $80,000) 
    Highest Parental Education Level (No College, Some College, Bachelor's Degree, Graduate-Level Degree) 

  Block 7: Gender and Ethnicity 
    Female Gender Indicator 
    Race/ethnicity (Asian, African American, Hispanic, Other/More than two ethnicities, White (referent)) 

Note: Italics indicates a binary variable. 

1 Variable was grand-mean centered. 

2 Based on 2000 US Census data. The median US household income in 2000 was $42,148 (US Census Bureau, 2014) and the 
2012 inflation-adjusted value was $55,030 (Noss, 2013). Year 2012 inflation-adjusted range categories include: < $46,246; 
$46,246-$62,476; > $62,476. The median US household income in 2012 was $51,371 (Noss, 2013). 

3 Categories based on sample tertiles. 

4 Variable was calculated for the ACT-tested population only (i.e., 2011, 2012, and 2013 ACT-tested high school graduating classes). 

5 Available for public schools only. 

6 Variable was treated as continuous in regression models since there were more than five response categories. 


Predictor Variable Selection 

Variables were considered for inclusion in the model after a screening process was applied that 
was similar to that used by Noble et al. (1999a, 1999b). To be considered in the modeling process, 
variables needed to have a bivariate correlation of at least 0.05 with at least four of the five outcome 
variables (unless the variable was specific to a certain subject test, in which case it only had to have 
a 0.05 correlation with the corresponding subject test). Based on these criteria, the variables that were 
eliminated from consideration as predictors in the model included: taking an American Government 
course, taking an Economics course, taking a Geography course, taking a core curriculum in English, 
taking a core curriculum in social studies, the school percentage of students taking the ACT test, 
planning to discuss college plans with a high school counselor, and parents being involved with 
college plans. 15 
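
The screening rule described above can be written as a small function. This sketch assumes the 0.05 threshold applies to the absolute value of the correlation, which the text does not state explicitly; the function name `passes_screen` and the correlation values are hypothetical.

```python
def passes_screen(correlations, subject=None, threshold=0.05, min_outcomes=4):
    """correlations: dict mapping each ACT outcome to the predictor's bivariate r."""
    if subject is not None:
        # Subject-specific predictors only need to correlate with their own test.
        return abs(correlations[subject]) >= threshold
    hits = sum(abs(r) >= threshold for r in correlations.values())
    return hits >= min_outcomes

# Hypothetical correlations for two candidate predictors.
general = {"Composite": 0.12, "English": 0.10, "Mathematics": 0.03,
           "Reading": 0.09, "Science": 0.06}
weak = {"Composite": 0.06, "English": 0.02, "Mathematics": 0.01,
        "Reading": 0.07, "Science": 0.05}

keep_general = passes_screen(general)                   # meets the threshold on 4 of 5 outcomes
keep_weak = passes_screen(weak)                         # meets the threshold on only 3 of 5
keep_subject = passes_screen(weak, subject="Science")   # judged on the science test alone
```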

Using the remaining possible predictors, a stepwise selection procedure was employed within 
each block using a significance level threshold of 0.01 to determine the relevant predictors. Once a 
predictor was included based on statistical significance, it was retained in the model regardless of 
whether the statistical significance changed after subsequent blocks were added. 

All analyses were conducted using the SAS 9.2 software package. To circumvent software 
limitations for stepwise selection with CR-SEs, a two-step procedure that takes advantage of 
the known property that the standard errors for OLS regression will be consistently downwardly 
biased (underestimated) was used. For each block, an OLS stepwise regression was conducted to 
identify predictors (or multi-parameter factors) that were significant at the 0.01 (two-tailed) level 

15 The academic commitment component was hypothesized to have a quadratic component, so the linear correlation may not have been 
appropriate and it was therefore retained in the model even though it did not meet the initial inclusion criteria. 
prior to accounting for clustering. Then, the predictors that were significant from the OLS model 
were modeled employing CR-SEs to correct the standard errors for clustering. Variables that were 
significant at the 0.01 level after using the CR-SEs were kept in the model. 

Comparing Differences in ACT Scores among Student Demographic Groups 

To evaluate whether other student and school characteristics, such as academic preparation, course 
grades, and behaviors, help to explain demographic group differences in ACT scores, unadjusted 
and adjusted mean differences in ACT scores were compared using the following student 
demographic characteristics: race/ethnicity (White vs. African American and White vs. Hispanic), 
gender, annual family income ($36,000 to $80,000 vs. < $36,000; and > $80,000 vs. < $36,000), 
and parental education level (some college vs. no college; bachelor's degree vs. no college; and 
graduate degree vs. no college). 

Relating Noncognitive Variables and HSGPA 

In order to examine the possibility that noncognitive characteristics may influence ACT scores 
through HSGPA, HSGPA was predicted from the noncognitive variables that were included in 
Block 5 (see Table 2); Blocks 1 through 4 were not entered into the model. 
Automatic selection procedures were used to decide which predictors to include in the final model. 
First, stepwise selection using a 0.01 p-value threshold was used. Since the HSGPA model was not 
a blockwise model, selection was compared to the more modern hybrid Least Angle Regression 
selection/least squares estimation method (LARS; Belloni & Chernozhukov, 2013; Belloni, 
Chernozhukov, & Hansen, 2014; Efron, Hastie, Johnstone, & Tibshirani, 2004). This was done to 
evaluate the extent to which the results from the ACT score models might depend on the stepwise 
method used. Interested readers should see Appendix C for more details about LARS. Because the 
ratio of sample size to candidate predictors is high, results are expected to be similar between the 
two selection methods. 
Results 


Descriptive Statistics 

Table 3 shows the demographic information for the study sample (n = 6,440) and for all high school 
seniors who took the ACT during the 2012-13 academic year (n = 975,702). For the sample, raw 
frequencies and weighted percentages are reported. Across all variables, the weighted percentages 
were very close to the percentages observed for all high school seniors nationally who took the ACT 
in 2012-13. The largest differences in the percentages between the sample and population were 
associated with students' self-reports on the various needing help indicators. In addition, average 
HSGPA and ACT Composite and subject scores were similar between the sample and population of 
high school seniors (see Table 4). 

Table 3. Student Characteristics for the Study Sample and High School Seniors Nationally 
Who Took the ACT in 2012-13 

                                                                    Sample  2012-13 HS Seniors
Characteristic             Category                          n  Weighted %             n     %

Race/Ethnicity             White                         3,505         54%       537,967   55%
                           African American                917         15%       151,325   16%
                           Hispanic                      1,023         15%       147,918   15%
                           Asian                           432          7%        45,021    5%
                           Other                           563          9%        93,471   10%

Gender                     Female                        4,234         57%       546,601   56%
                           Male                          2,206         43%       429,101   44%

Family Income              < $36,000                     2,128         35%       239,214   33%
                           $36,000 to $80,000            2,348         36%       247,166   34%
                           > $80,000                     1,964         29%       239,261   33%

Math Course Sequence       Less than AAG                   199          4%        49,450    5%
                           Other-Low (< 3 years)            76          1%        14,329    1%
                           AAG                           1,352         23%       247,191   26%
                           AAGO                          1,540         25%       220,788   23%
                           AAGT                            745         12%       121,365   13%
                           AAGOT                           883         13%       119,184   12%
                           AAGTC                           431          6%        48,312    5%
                           AAGOTC                        1,048         14%       113,459   12%
                           Other-High (> 3 years)          166          2%        24,276    3%

Science Course Sequence    Less than Biology                96          2%        15,352    2%
                           Biology                         554         10%       103,913   11%
                           Biology & Chemistry           2,783         44%       428,589   45%
                           Biology, Chemistry, Physics   2,825         42%       380,457   40%
                           Other 3-year sequence           182          3%        29,846    3%

Advanced English 1         Yes                           3,342         46%       412,169   42%
                           No                            3,098         54%       563,533   58%

Advanced Math 1            Yes                           2,882         39%       349,953   36%
                           No                            3,558         61%       625,749   64%

Advanced Soc. Studies 1    Yes                           2,827         39%       357,411   37%
                           No                            3,613         61%       618,291   63%

Advanced Science 1         Yes                           2,788         38%       343,109   35%
                           No                            3,652         62%       632,593   65%

Expected Education         Below Bachelor's Degree         297          6%        54,663    6%
                           Bachelor's Degree             3,011         50%       461,687   52%
                           Beyond Bachelor's Degree      3,132         44%       377,504   42%

Parent Education           No College                    1,327         21%       175,340   21%
                           Some College                  2,087         33%       241,694   29%
                           Bachelor's Degree             1,640         25%       227,018   28%
                           Graduate Degree               1,386         21%       179,392   22%

Need Help: Occ./Educ.      Yes                           3,280         51%       432,129   44%
                           No                            3,160         49%       543,573   56%

Need Help: Math Skills     Yes                           2,319         38%       322,409   33%
                           No                            4,121         62%       653,293   67%

Need Help: Reading         Yes                           1,831         30%       235,558   24%
                           No                            4,609         70%       740,144   76%

Need Help: Study Skills    Yes                           2,887         47%       407,515   42%
                           No                            3,553         53%       568,187   58%

Need Help: Writing Skills  Yes                           1,639         26%       201,787   21%
                           No                            4,801         74%       773,915   79%

Note: Percentages may not sum to 100% due to rounding. Percentages for high school seniors nationally who took the ACT in 2012-13 
were based on respondents only for the following characteristics: family income, math course sequence, science course sequence, 
expected education level, and highest parental education level. The percentages of missing responses for the population were as follows: 
26% for family income, 2% for math course sequence, 2% for science course sequence, 8% for expected education level, and 16% for 
highest parental education level. For the sample and the population, the Other racial/ethnic category included 3% and 5%, respectively, of 
students who did not provide their race/ethnicity. 

1 Advanced coursework variables included accelerated, advanced placement, and honors courses; they did not include dual-credit courses 
in these analyses since this information was not available for the population. 
Table 4. Average HSGPA and ACT Scores for the Study Sample and High School Seniors 
Nationally Who Took the ACT in 2012-13 

Measure                            Sample         2012-13 HS Seniors

Mean HSGPA (SD)
  Overall HSGPA (0.00-4.00)        3.31 (0.55)    3.29 (0.57)

Mean ACT Score (SD)
  Composite                        21.3 (5.0)     21.1 (5.0)
  English                          20.8 (6.3)     20.6 (6.2)
  Mathematics                      21.1 (5.1)     21.0 (5.1)
  Reading                          21.6 (6.0)     21.4 (6.0)
  Science                          21.2 (5.0)     20.9 (5.0)

Note: The percentage of students from the population missing HSGPA was 14%. Weighted statistics are reported for the sample. 


ACT Score Models 

Regression coefficients from the final blockwise regression models are provided in Table 5. The 
incremental percentage of the variation explained by each block is also provided. Next, results are 
described block by block, including interpretation of some of the significant regression coefficients. 
All coefficients are to be interpreted as conditional on all other predictors in the model. 
Table 5. Blockwise Regression Model Results 

Entries are regression coefficients; rows labeled ΔR² give the incremental R² for the block.

Predictor                                   Composite      English  Mathematics      Reading      Science

Intercept                                       19.80        17.73        20.14        20.59        20.45

Block 1: High School Grades Earned
  ΔR²                                            0.31         0.28         0.29         0.20         0.23
  Overall HSGPA                                  2.18         2.74         2.05         2.16         1.83

Block 2: High School Course Information
  ΔR²                                            0.08         0.05         0.13         0.04         0.08
  Math Course Sequence
    Less than AAG                              -0.38†        -0.41        -0.39        -0.25        -0.69
    AAG (referent)
    AAGO                                         0.59         0.58         0.71         0.57         0.56
    AAGT                                         0.54         0.64         0.82         0.40         0.41
    AAGOT                                        1.33         1.57         1.63         1.10         1.21
    AAGTC                                        2.04         2.04         2.62         1.68         2.01
    AAGOTC                                       2.32         2.37         3.02         1.86         2.21
    Other (> 3 yrs)                              0.99         0.94         1.59         0.50         1.18
    Other (< 3 yrs)                              0.56         0.58         0.77         0.38         0.28
  Science Course Sequence
    Less than Biology††                          0.48         0.58         0.78           --         0.40
    Biology (referent)
    Biology, Chemistry                           0.27         0.39         0.34           --         0.18
    Biology, Chemistry, Physics                  0.53         0.39         0.82           --         0.60
    Other 3-year sequence                        0.12        -0.08        0.55†           --         0.07
  Years of Foreign Language                        --         0.10           --           --           --

Block 3: Advanced High School Coursework
  ΔR²                                            0.04         0.05         0.04         0.04         0.03
  Advanced English                               0.54         1.13        -0.15         0.99           --
  Advanced Math                                  0.66           --         1.30           --         0.68
  Advanced Nat Science                           0.49         0.67         0.63         0.42         0.64
  Advanced Social Studies                        0.69         1.10         0.30         1.12         0.40
  College Credits Earned in HS
    0 (referent)
    1-6                                         -0.04        -0.12        0.26*        -0.09        -0.03
    7 or more                                    0.39         0.26         0.60        0.42*         0.44

Block 4: High School Characteristics
  ΔR²                                            0.09         0.08         0.07         0.07         0.07
  Median Zip Code Income
    Low (referent)
    Middle                                       0.48         0.41         0.46         0.47         0.53
    High                                         0.67         0.60         0.70         0.53         0.72
  % College Enrollment                             --           --         0.01           --         0.01
  % Free/Reduced Lunch
    Low (referent)
    Middle                                      -0.27        -0.27        -0.37        -0.28        -0.15
    High                                        -0.51        -0.59        -0.59        -0.44        -0.33
  % Intending Graduate Degree                    0.03         0.03         0.02         0.03        0.01*
    Quadratic                                  < 0.01       < 0.01       < 0.01       < 0.01       < 0.01
  % Minority
    Low (referent)
    Middle                                      -0.16        -0.15        -0.23        -0.14        -0.09
    High                                        -0.87        -0.87        -0.78        -0.93        -0.78
  Non-Public School Indicator                   -0.13         0.70        -0.76         0.15        -0.69

Block 5: Noncognitive Characteristics
  ΔR²                                            0.06         0.07         0.04         0.07         0.04
  College Prep Course Curriculum                 0.34         0.41           --         0.47        0.28*
  Expected Ed. Attainment
    Below Bachelor's (referent)
    Bachelor's Degree                            0.34         0.50         0.24         0.29         0.28
    Beyond Bachelor's Degree                     1.08         1.34         0.81         1.21         0.92
  Need Help: Educ./Occu. Plans                     --         0.38           --           --           --
  Need Help: Writing Skills                        --        -0.26           --           --           --
  Need Help: Study Skills                          --       -0.34*           --           --           --
  Need Help: Reading Speed and Comp.            -1.33        -1.69           --        -2.39        -0.94
  Need Help: Math Skills                        -0.52           --        -1.49           --        -0.69
  Parents Check Assignments                     -0.31        -0.41        -0.24        -0.35        -0.23
  Perception of Educ. Component                  0.13           --         0.16           --         0.19
  Student Challenged by School                  -0.39        -0.41        -0.27        -0.49        -0.36
  Tested in Junior Year                          0.77         1.35         0.58         0.64         0.74

Block 6: SES-Related Demographics
  ΔR²                                            0.01         0.01        <0.01         0.01         0.01
  English Spoken at Home                         0.70         0.99           --         0.91         0.68
  Family Income
    < $36,000 (referent)
    $36,000 to $80,000                           0.24        0.37*         0.16           --         0.22
    > $80,000                                    0.39         0.61         0.46           --         0.26
  Highest Parental Education Level
    No College (referent)
    Some College                                 0.36         0.56         0.15         0.54         0.21
    Bachelor's Degree                            0.61         0.91         0.35         0.89         0.34
    Graduate Degree                              0.73         1.14         0.35         1.11         0.44

Block 7: Gender and Race/Ethnicity
  ΔR²                                            0.02         0.02         0.03         0.01         0.03
  Female                                        -0.64           --        -1.14           --        -1.19
  Race/Ethnicity
    Asian                                       -0.57        -1.24         0.85        -1.43        -0.58
    African American                            -2.04        -2.28        -1.67        -2.13        -2.07
    Hispanic                                    -1.53        -1.98        -1.11        -1.66        -1.41
    Other                                       -0.44        -0.71        -0.28        -0.32        -0.43
    White (referent)

Final R²                                         0.61         0.56         0.60         0.44         0.49
Root Mean Square Error                           3.13         4.22         3.21         4.47         3.54

-- indicates that the predictor was not significant for the particular outcome variable 
† indicates a p-value between 0.010 and 0.015 upon entry to the model 
* indicates a p-value between 0.010 and 0.015 in the final model 
†† sample size for the less than Biology course sequence was relatively small (< 100 students) 

Note: Grey shading indicates that the predictor was not statistically significant upon entry but was retained as part of a factor. Orange shading indicates that the 
predictor was statistically significant upon entry but was no longer significant in the final model. Weighted analyses were used. 
High School Academic Factors 

Overall HSGPA accounted for the largest proportion of variance of any predictor in the model: 20% 
(reading) to 31% (Composite). Although both overall and subject-specific HSGPAs were considered, 
overall HSGPA accounted for more variation across all ACT scores and was thus the measure of 
HSGPA retained in the models. Conditional on all other predictors in the final model, a one-point 
increase in overall HSGPA above the grand-mean was predicted to increase ACT scores by 1.8 to 
2.7 points, on average. 

Block 2 accounted for between 4% (reading) and 13% (mathematics) of additional variance in ACT 
scores beyond the variance accounted for by Block 1. Taking higher-level mathematics courses 
was predicted to increase ACT scores in every subject area and for the Composite score. Figure 1 
shows the adjusted average ACT mathematics score for specific mathematics course sequences. 
Compared to students who took Algebra I, Geometry, and Algebra II, students who took mathematics 
courses beyond Algebra II had higher ACT mathematics scores by 0.7 to 3.0 points, on average. 

Science coursework entered the model for all tests except reading. However, some of the specific 
science course sequence indicators were not significant predictors of performance on the ACT tests 
once additional blocks of variables entered the models, especially for the English test. Specifically, 
once all blocks had entered the models, the Biology, Chemistry, and Physics course sequence indicator 
remained statistically significant only for predicting ACT science, mathematics, and Composite scores. 
The average ACT science score was 0.6 point higher for students who took Biology, 
Chemistry, and Physics than for those who only took Biology. 



Figure 1: Adjusted average ACT mathematics score by mathematics course sequence. Course sequences shown: Less than AAG; Alg I, Geom, Alg II; Alg I, Geom, Alg II, Other Adv. Math; Alg I, Geom, Alg II, Other Adv. Math, Trig; Alg I, Geom, Alg II, Other Adv. Math, Trig, Calc. 
The number of years of foreign language taken entered the model only for the English test, but 
was not a significant predictor in the final model. The social studies courses (e.g., Psychology and 
other history) and core curriculum indicators (e.g., overall and in mathematics and science) did not 
enter any of the models after the math and science course sequences had entered. In general, 
these courses either had very high or very low participation rates or were found to be related to the 
mathematics and science course sequences. 

The variables in Block 3 accounted for between 3% and 5% of additional variance beyond that 
accounted for by Blocks 1 and 2. In general, students who took accelerated, advanced, honors, 
and/or dual-enrollment coursework in high school had higher ACT scores than those who did not. 
The advanced coursework regression coefficients for the different subject areas did vary, however, 
depending on which ACT score was being modeled. For example, students taking accelerated, 
advanced, honors, and/or dual-enrollment courses in English were predicted to score about 1.0 
to 1.1 points higher on the ACT reading and English tests, on average. In contrast, performance 
on the ACT mathematics and science tests was not related to whether a student took advanced 
coursework in English. Additionally, for all tests except English, students expecting to earn 
seven or more college credits while in high school were predicted to score between 0.4 and 0.6 
point higher than students expecting to earn zero college credits. In total (for Blocks 2 and 3), 
the coursework taken by students in high school accounted for between 8% (reading) and 17% 
(mathematics) of the variance in ACT scores, beyond that accounted for by HSGPA. 
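
The additive bookkeeping behind these totals can be checked directly from the ΔR² values reported in Table 5. The short Python sketch below is only an arithmetic check; the one assumption is that the mathematics entry for Block 6, reported as "<0.01", is treated as 0.00.

```python
# Incremental R-squared for Blocks 1 through 7, as reported in Table 5.
# Assumption: the "<0.01" entry (mathematics, Block 6) is entered as 0.00.
delta_r2 = {
    "Composite":   [0.31, 0.08, 0.04, 0.09, 0.06, 0.01, 0.02],
    "English":     [0.28, 0.05, 0.05, 0.08, 0.07, 0.01, 0.02],
    "Mathematics": [0.29, 0.13, 0.04, 0.07, 0.04, 0.00, 0.03],
    "Reading":     [0.20, 0.04, 0.04, 0.07, 0.07, 0.01, 0.01],
    "Science":     [0.23, 0.08, 0.03, 0.07, 0.04, 0.01, 0.03],
}

# Coursework contribution = Block 2 + Block 3 (list positions 1 and 2).
coursework = {test: round(v[1] + v[2], 2) for test, v in delta_r2.items()}

# Sum over all seven blocks, to compare with the final R-squared row of Table 5.
total = {test: round(sum(v), 2) for test, v in delta_r2.items()}
```

The coursework sums range from 0.08 (reading) to 0.17 (mathematics), and the seven-block totals reproduce the final R² values in Table 5.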

High School Characteristics 

The high school characteristics in Block 4 accounted for between 7% and 9% of additional 
variance in ACT scores beyond the factors included in Blocks 1 through 3. All of the high school 
characteristics that met the initial screening process entered at least one of the models. Two of 
the significant characteristics were proxy measures for school and neighborhood poverty. Students 
who attended schools located in zip code areas associated with mid and high values for median 
household income were predicted to score, on average, between 0.4 and 0.5 point and between 
0.5 and 0.7 point higher, respectively, than students from neighborhoods with low values for median 
household income. Moreover, ACT English, mathematics, and Composite scores were negatively 
related to school percentages of FRL-eligible students. Students in schools with higher FRL-eligible 
percentages were typically predicted to score 0.5 to 0.6 point lower than students from schools with 
lower FRL-eligible percentages. On average for ACT mathematics scores, students from schools 
with percentages of FRL-eligible students in the middle range were predicted to score 0.4 point 
lower than students from schools with lower FRL-eligible percentages. 

School characteristics related to the college-going culture of the high school attended were also 
found to be related to performance on the ACT. For example, the school percentage of ACT-tested 
students enrolling in college the fall following high school graduation was significantly related to 
ACT mathematics and science scores. Additionally, the percentage of ACT-tested students at a 
school aspiring to earn a graduate degree had a significant, positive quadratic term in relation to ACT 
scores (see Table 5), meaning that this school characteristic was non-linearly related to performance 
on the ACT tests. For instance, as illustrated in Figure 2, the average ACT Composite score is almost 
1.0 scale score point higher for students at schools with 62% of ACT-tested students aspiring to 
a graduate degree, compared to students at schools with a corresponding percentage of 42% 
(the sample mean). The predicted increase in ACT Composite score was positive for students from 
schools with more than 41% of students at the school aspiring to a graduate degree. 16 
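
To illustrate what a positive quadratic term implies, the following minimal sketch compares the predicted change in score over two equally wide ranges of the school percentage. The coefficients B_LINEAR and B_QUADRATIC are hypothetical values chosen only for illustration; they are not the estimates reported in Table 5, and the function name is likewise an assumption.

```python
def predicted_shift(pct, reference_pct, b_linear, b_quadratic):
    """Change in predicted score when the school percentage moves
    from reference_pct to pct, for a model with linear and squared terms."""
    def contribution(p):
        return b_linear * p + b_quadratic * p ** 2
    return contribution(pct) - contribution(reference_pct)

# Hypothetical coefficients (NOT the Table 5 estimates).
B_LINEAR = -0.02
B_QUADRATIC = 0.0006  # positive quadratic term

# Two 20-percentage-point increases: below the mean and above the mean.
low_to_mean = predicted_shift(42, 22, B_LINEAR, B_QUADRATIC)
mean_to_high = predicted_shift(62, 42, B_LINEAR, B_QUADRATIC)

print(f"22% -> 42%: {low_to_mean:.2f} points")
print(f"42% -> 62%: {mean_to_high:.2f} points")
```

With a positive quadratic coefficient, the same 20-point increase in the school percentage is associated with a larger predicted gain at higher percentages than at lower ones, which is the sense in which the relationship is non-linear.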

Students in schools that had a higher percentage of racial/ethnic minority students were 
predicted to score about 0.8 to 0.9 point lower, on average, than students from schools with a 
lower percentage of racial/ethnic minority students. In the final model, there was not a significant 
difference in average ACT scores between students from schools that had percentages of racial/ 
ethnic minority students in the middle and lower ranges. Lastly, compared to students attending non- 
public schools, those attending public schools generally scored about 0.8 and 0.7 point higher on 
ACT mathematics and science tests, respectively. 



Figure 2: Predicted increase in ACT Composite score as a function of the percentage of 
students at school aspiring to earn a graduate degree. 


16 About 56% of schools had less than 41% of students at the school aspiring to a graduate degree. 

Noncognitive Characteristics 

The block of noncognitive characteristics accounted for between 4% and 7% of additional 
variance in ACT scores, beyond HSGPA, high school coursework taken, and school characteristics 
(Blocks 1 through 4). Several aspects of students’ educational goals and values were positively 
related to performance on the ACT. First, students reporting plans to pursue a post-baccalaureate 
degree were predicted to score 0.8 (mathematics) to 1.3 (English) points higher, on average, than 
students intending to pursue a degree below a bachelor's degree (e.g., an associate’s degree). 
Second, students with higher perceptions of the value of education tended to have slightly higher 
ACT mathematics, science, and Composite scores, compared to those with lower values on this 
component (by 0.1 to 0.2 point for each one-unit increase). Third, higher average ACT scores were 
found for students who had taken the ACT test prior to their senior year and for students who had 
described their high school coursework as a college preparatory curriculum, compared to those who 
did not (by 0.6 to 1.4 score points across all tests and by 0.3 to 0.5 point across all tests except 
in mathematics, respectively). Fourth, students who indicated needing help with their educational/ 
occupational plans were predicted to score higher on the ACT English test (by 0.4 point). 

The remaining significant predictors were negatively related to performance on the ACT. For 
instance, students indicating that they need help with certain academic skills were predicted to 
have lower ACT scores, on average. Specifically, students reporting that they need help on reading 
speed and comprehension scored lower on all tests except in mathematics (by 0.9 to 2.4 points), 
and students reporting that they need help with their math skills were predicted to have lower 
ACT mathematics, science, and Composite scores (by 0.5 to 1.5 points). In the final model, while 
indications of needing help with study skills remained related to ACT English scores, indications 
of needing help with writing skills did not. Additionally, ACT scores in all four subject areas were 
negatively related to the frequency at which students felt challenged by their high school coursework 
(by a one-unit incremental effect of 0.3 to 0.5). 

Although parental involvement is often thought to positively influence student educational outcomes, 
one of the parental involvement predictors (parents involved in post-high school plans) did not enter 
any of the models and the other (parents check that my assignments are complete) was negatively 
related to ACT scores. 17 This latter result suggests that students whose parents more frequently 
check their assignments tended to score lower than those whose parents less frequently check 
their assignments (by a one-unit incremental effect of 0.2 to 0.4 point). The following noncognitive 
characteristics did not enter the ACT score models: academic commitment component, hours spent 
studying per week outside of class, hours spent working for pay per week, and plans to find out what 
education is necessary for desired jobs. 

17 This result was consistent with the bivariate relationship between the frequency with which parents check assignments and ACT 
scores. More elaboration is provided in the discussion. 

Demographic Characteristics 

After accounting for HSGPA, high school coursework taken, school characteristics, and noncognitive 
student characteristics (Blocks 1 through 5), student demographic characteristics (Blocks 6 and 7) 
explained a fairly small proportion of the variance in ACT scores (2% to 4%). For the SES-related 
characteristics in Block 6, students with higher annual family incomes, higher parental education 
levels, and those from families where English is the primary language spoken at home tended 
to score higher on ACT tests, as compared to their corresponding peers. However, these results 
varied by subject area. In the final models, English being the primary language spoken at home was 
related to all ACT scores except mathematics. Average ACT Composite, English, and mathematics 
scores significantly differed among income groups; average ACT Composite and mathematics 
scores differed between higher- and lower-income students only. Even after accounting for all 
other predictors in the model, as parental education level increased, so did average ACT Composite, 
English, and reading scores. 

Gender and racial/ethnic differences in average ACT scores were also found (Block 7). Compared 
to male students, female students tended to score 1.1 to 1.2 points lower on ACT mathematics and 
science; no gender differences were found in average ACT English and reading scores. The adjusted 
gender difference in average ACT Composite scores was 0.6 point. Lastly, African American and 
Hispanic students generally scored lower than White students across all tests by 1.7 to 2.3 points 
and 1.1 to 2.0 points, respectively. Asian students tended to score 0.9 point higher than White 
students, on average, on the mathematics test, but they typically scored 1.4 and 1.2 points lower on 
the reading and English tests, respectively. 

To evaluate the impact of including both student- and school-level demographic characteristics in 
the models, additional models were estimated that excluded certain school-level characteristics. For 
example, compared to the regression coefficients from the final models (Table 5), only small changes 
were observed in regression coefficients for the student-level racial/ethnic categories when the 
school-level minority percentage was removed from Block 4; they tended to increase in magnitude 
by at most 0.4 point across all tests. Similarly, the student-level income coefficients (in Block 6) 
increased by at most 0.1 point across all tests when the school-level income-related predictors were 
removed from Block 4. 

Summary of Final Models 

In summary, the total amount of variance explained across all five ACT scores ranged from 44% 
(reading) to 61% (Composite). Figure 3 shows the percentage of variance explained by each block 
of predictors. High school academic factors, such as HSGPA and high school coursework taken 
(Blocks 1 through 3), accounted for the greatest proportion of explained variance in all five ACT 
test scores (R² = 0.28 to 0.46). These three blocks alone accounted for 64% to 77% of the total 
variance explained by the models. Figure 4 shows the percentage of variance accounted for by each 
major block category: high school academic factors, school characteristics, noncognitive student 
characteristics, and student demographic characteristics. 
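
The blockwise logic behind these percentages can be sketched as follows. This is a minimal illustration on synthetic data, not the study's model: the variables (hsgpa, coursework, noncognitive), their effect sizes, and the helper functions are assumptions, and the sketch uses ordinary least squares without the cluster-robust standard errors or variable screening used in the study. The final lines reproduce the share-of-explained-variance arithmetic, assuming the lower bounds quoted above (R² = 0.28 and a total of 44%) both refer to the reading test.

```python
import numpy as np

def r_squared(X, y):
    """R^2 from an ordinary least squares fit of y on X (intercept added)."""
    design = np.column_stack([np.ones(len(y)), X])
    coef, *_ = np.linalg.lstsq(design, y, rcond=None)
    residuals = y - design @ coef
    ss_res = float(residuals @ residuals)
    ss_tot = float(((y - y.mean()) ** 2).sum())
    return 1.0 - ss_res / ss_tot

def incremental_r_squared(blocks, y):
    """Enter blocks in order; return the R^2 gained as each block is added."""
    gains, entered, previous = [], [], 0.0
    for block in blocks:
        entered.append(block)
        current = r_squared(np.column_stack(entered), y)
        gains.append(current - previous)
        previous = current
    return gains

# Synthetic data with hypothetical effect sizes (illustration only).
rng = np.random.default_rng(0)
n = 2000
hsgpa = rng.normal(size=n)
coursework = 0.5 * hsgpa + rng.normal(size=n)
noncognitive = 0.3 * hsgpa + rng.normal(size=n)
score = (2.0 * hsgpa + 1.0 * coursework + 0.5 * noncognitive
         + rng.normal(scale=2.0, size=n))

gains = incremental_r_squared([hsgpa, coursework, noncognitive], score)
for name, gain in zip(["HSGPA", "Coursework", "Noncognitive"], gains):
    print(f"{name}: +{gain:.3f}")

# Share of explained variance attributable to academic factors (reading).
reading_share = 0.28 / 0.44
print(f"Academic share of explained variance: {reading_share:.0%}")
```

Because each block is credited only with the variance it adds beyond the blocks already entered, any variance shared between HSGPA and a later block is attributed to HSGPA, which entered first.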


[Figure 3: stacked horizontal bar chart. Bars: Composite, English, Mathematics, Reading, Science. Segments: HSGPA (Block 1), Courses Taken (Block 2), Advanced Coursework (Block 3), High School Characteristics (Block 4), Noncognitive Characteristics (Block 5), SES-Related Demographics (Block 6), Gender & Race/Ethnicity (Block 7). Horizontal axis: Total Variance Accounted For, 0.00 to 0.70.] 

Figure 3: Proportion of variance in ACT scores explained by each block of predictors. 

[Figure 4: stacked horizontal bar chart. Bars: Composite, English, Mathematics, Reading, Science. Segments: high school academic factors, School Characteristics, Noncognitive Characteristics, Demographics. Horizontal axis: Total Variance Accounted For, 0 to 0.7.] 

Figure 4: Proportion of variance in ACT scores explained by overall block categories. 

Unadjusted and Adjusted Mean Differences by Student Demographic Characteristics 

To evaluate the extent to which other student and school characteristics explain differences 
observed in mean ACT scores among student demographic groups, unadjusted and adjusted 
mean differences among these groups were compared. These comparisons were examined for the 
following student demographic characteristics: race/ethnicity, gender, annual family income, and 
parental education level. 
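
The distinction between unadjusted and adjusted mean differences can be sketched with a small simulation. Everything in this example is an assumption made for illustration: the data are synthetic, a single covariate (preparation) stands in for the full set of student and school predictors, and the coefficients are arbitrary. The unadjusted difference is the raw gap in group means; the adjusted difference is the group coefficient from a regression that also includes the covariate.

```python
import numpy as np

# Synthetic data (illustration only): two groups that differ in preparation.
rng = np.random.default_rng(1)
n = 5000
group = rng.integers(0, 2, size=n).astype(float)
preparation = 1.0 * group + rng.normal(size=n)
score = 2.0 * preparation + 0.5 * group + rng.normal(size=n)

# Unadjusted difference: raw gap between group means.
unadjusted = score[group == 1].mean() - score[group == 0].mean()

# Adjusted difference: group coefficient after controlling for preparation.
design = np.column_stack([np.ones(n), group, preparation])
coef, *_ = np.linalg.lstsq(design, score, rcond=None)
adjusted = coef[1]

print(f"Unadjusted difference: {unadjusted:.2f}")
print(f"Adjusted difference:   {adjusted:.2f}")
```

In this simulated setting most of the raw gap is carried by the covariate, so the adjusted difference is much smaller than the unadjusted one, mirroring the pattern reported in this section.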

Race/ethnicity. Unadjusted mean differences in ACT scores between White and African American 
students ranged from 4.2 points (mathematics) to 5.6 points (English). Similarly, unadjusted 
mean differences between White and Hispanic students were rather large, ranging from 2.7 
points (mathematics) to 4.9 points (English). However, after accounting for HSGPA, high school 
coursework taken, school characteristics, noncognitive student characteristics, and other student 
demographics, mean differences were reduced by nearly 60% and ranged from 1.7 to 2.3 points 
for White and African American students and ranged from 1.1 to 2.0 points for White and Hispanic 
students (see Figure 5). 

Gender. Unadjusted mean ACT scores were 1.1 score points higher in mathematics and science for 
male students than for female students, and were 0.3 to 0.4 point lower in reading and English for 
males. Even after accounting for the other student and school characteristics, differences persisted 
in mean ACT mathematics and science scores between male and female students. The adjusted 
mean ACT English and reading scores did not significantly differ between male and female students. 


[Figure 5: bar chart with paired Unadjusted and Adjusted bars for Composite, English, Math, Reading, and Science. Panels: White vs. African American; White vs. Hispanic. Vertical axis: 0.0 to 6.0.] 

Figure 5: Unadjusted and adjusted mean differences in ACT scores by race/ethnicity. 

Family income. Unadjusted mean differences in ACT scores ranged from 2.0 points 
(mathematics) to 3.1 points (English) between middle- and lower-income students and from 
3.7 points (science) to 5.3 points (English) between higher- and lower-income students. After 
accounting for the other student and school predictors in the final regression models, the mean 
differences were reduced by between 87% and 95% (see Figure 6). For example, the unadjusted 
mean difference in average ACT reading scores between higher- and lower-income students was 
reduced from 4.3 points to 0.2 point, after other student and school characteristics were taken into 
account. 
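
The percentage reduction quoted for the reading example follows directly from the two mean differences. The short sketch below reproduces that arithmetic; the function name is an assumption introduced for illustration.

```python
def percent_reduction(unadjusted, adjusted):
    """Percentage by which the adjusted difference is smaller than the unadjusted one."""
    return 100.0 * (unadjusted - adjusted) / unadjusted

# Reading, higher- vs. lower-income students: 4.3 points unadjusted, 0.2 adjusted.
reading_reduction = percent_reduction(4.3, 0.2)
print(f"{reading_reduction:.1f}%")  # about 95.3%
```

This value sits at the upper end of the 87% to 95% range reported for the income comparisons.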


[Figure 6: bar chart with paired Unadjusted and Adjusted bars for Composite, English, Math, Reading, and Science. Panels: $36,000-$80,000 vs. < $36,000; > $80,000 vs. < $36,000.] 

Figure 6: Unadjusted and adjusted mean differences in ACT scores by family income. 

Parental education. Compared to students whose parents had no college experience (first- 
generation students), the unadjusted means in ACT scores were 1.1 points to 2.3 points higher for 
students whose parents had some college experience, 2.9 points to 4.6 points higher for students 
whose parents completed a bachelor’s degree, and 4.2 points to 6.3 points higher for students 
whose parents completed a graduate-level degree. Adjusted mean differences in ACT scores 
between these groups were substantially smaller (reduced by at least 74%; see Figure 7), after other 
student and school characteristics were taken into account (Table 5). The largest adjusted mean 
difference across ACT test scores and parental education level was 1.1 scale score points. 

HSGPA Predicted from Noncognitive Student Characteristics 

Table 6 shows the noncognitive student characteristics that were found to be significantly related 
to HSGPA based on stepwise and LARS selection procedures. For both selection methods, the 
noncognitive predictors accounted for 29% of the variance in HSGPA. In comparison to the final 
models presented in Table 5, the following noncognitive predictors were positively related to HSGPA 
but were not significantly related to ACT scores: hours spent studying per week outside of class, 
hours spent working for pay per week, and the academic commitment component. Moreover, 
students indicating that they need help with their study skills were predicted to have lower HSGPAs 
than students who did not indicate needing such help. In contrast, this predictor was significantly 
related to ACT English scores only (Table 5). 

The remaining predictors of HSGPA were also found to be significantly related to ACT scores 
and they were related in the same direction. These findings suggest that noncognitive student 
characteristics, such as academic commitment and hours spent studying, are also indirectly related 
to performance on the ACT through their effects on HSGPA. 

[Figure 7: bar chart. Panels: Some College vs. No College; Bachelor's Degree vs. No College; Graduate Degree vs. No College. Vertical axis: 0.0 to 7.0.] 

Figure 7: Unadjusted and adjusted mean differences in ACT scores by parental education 
level. 

Differences in the selected predictors and their coefficients were rather small between the stepwise 
selection procedure using p-values and the LARS procedure. The same predictors were selected by 
the two procedures, except that two variables were not selected by the LARS 
procedure: students indicating that they need help with their educational and occupational plans 
and that they are challenged by their high school coursework. Results for these two methods were 
compared and presented to help address the question of whether the use of a traditional stepwise 
selection procedure would lead to vastly different final models than alternative selection procedures 
that did not rely upon p-values. These findings suggest that there are minimal concerns with the use 
of a stepwise selection procedure for the ACT models in this study. 
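
How small the differences are can be checked directly from the estimates reported in Table 6. The sketch below is only a tabulation of those published values (stepwise estimate, LARS estimate); the dictionary layout and variable names are assumptions, labels are lightly abbreviated, and None marks a predictor that the LARS procedure did not select. The intercept is left out so that only predictor coefficients are compared.

```python
# (stepwise estimate, LARS estimate) as reported in Table 6.
TABLE_6 = {
    "Academic Commitment Component": (0.13, 0.12),
    "College Prep Curriculum": (0.14, 0.14),
    "Educational Perception Component": (0.04, 0.03),
    "Expected Education: Bachelor's": (0.20, 0.20),
    "Expected Education: Beyond Bachelor's": (0.39, 0.37),
    "Find High School Challenging": (-0.04, None),
    "Studying: 6-10 Hours": (0.07, 0.06),
    "Studying: 11 or More Hours": (0.13, 0.12),
    "Working: 1-10 Hours": (0.07, 0.07),
    "Working: 11 or More Hours": (0.03, 0.02),
    "Need Help: Educ./Occ. Plans": (0.05, None),
    "Need Help: Math Skills": (-0.19, -0.18),
    "Need Help: Study Skills": (-0.13, -0.11),
    "Parents Check Assignments": (-0.02, -0.03),
    "Took ACT Junior Year": (0.12, 0.13),
}

# Predictors selected by both procedures, and the gap between their estimates.
differences = {
    name: round(abs(stepwise - lars), 2)
    for name, (stepwise, lars) in TABLE_6.items()
    if lars is not None
}
largest_gap = max(differences.values())

# Predictors selected by the stepwise procedure only.
stepwise_only = [name for name, (_, lars) in TABLE_6.items() if lars is None]

print(f"Largest coefficient gap: {largest_gap:.2f}")
print("Selected by stepwise only:", stepwise_only)
```

Among the predictors selected by both procedures, no coefficient differs by more than 0.02 HSGPA units, and the two predictors dropped by LARS have the smallest-magnitude stepwise estimates apart from a few others near zero.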

Table 6. Stepwise Selection Results for Predicting HSGPA from Noncognitive Factors 

Variable                              Stepwise Estimates    LARS Estimates 
Intercept                                    3.09                3.06 
Academic Commitment Component                0.13                0.12 
College Prep Curriculum                      0.14                0.14 
Educational Perception Component             0.04                0.03 
Expected Education Attainment 
    Below Bachelor's (referent) 
    Bachelor's                               0.20                0.20 
    Beyond Bachelor's                        0.39                0.37 
Find High School Challenging                -0.04                 — 
Hours Spent Studying per Week 
    0-5 Hours (referent) 
    6-10 Hours                               0.07                0.06 
    11 or More Hours                         0.13                0.12 
Hours Spent Working per Week 
    0 Hours (referent) 
    1-10 Hours                               0.07                0.07 
    11 or More Hours                         0.03*               0.02* 
Need Help—Educ./Occ. Plans                   0.05                 — 
Need Help—Math Skills                       -0.19               -0.18 
Need Help—Study Skills                      -0.13               -0.11 
Parents Check Assignments                   -0.02               -0.03 
Took ACT Junior Year                         0.12                0.13 
R²                                           0.29                0.29 
Root Mean Square Error                       0.46                0.46 

Note: All variables are significant at the 0.01 level unless noted with an asterisk (*). 

Discussion 

Similar to the results from two prior studies conducted in 1999 (Noble et al., 1999a, 1999b) on 
the relationships between noncognitive characteristics and ACT scores, this study on a more 
recent cohort of students found that between 44% and 61% of the variance in ACT scores can be 
explained by HSGPA, coursework taken, school characteristics, noncognitive student characteristics, 
and demographic characteristics. Over and above HSGPA, coursework taken, and school 
characteristics, noncognitive variables accounted for between 4% and 7% of the variance in ACT 
scores. This percentage was slightly higher than that for advanced coursework (between 3% and 
5%) and slightly lower than that for school-level characteristics (between 7% and 9%), conditional 
on variables entering the model in previous blocks. It is important to note that the noncognitive 
characteristics available in this study do not represent the universe of noncognitive factors; this 
could have affected the percentage of variance explained by the noncognitive characteristics. 

In contrast to the current study, the Gaertner and McClarty study (2015) found a much larger 
percentage of variance accounted for by noncognitive factors related to students' behaviors, 
motivation, and social engagement (32% vs. 4% to 7% in this study). However, it should be noted 
that their college readiness index included both HSGPA and ACT/SAT test scores while this 
study used HSGPA as a predictor of ACT test scores. Moreover, their academic achievement and 
noncognitive predictors were measured from when students were in middle school. This study used 
more proximal measures of academic preparation (HSGPA and high school coursework taken) 
and noncognitive characteristics for a sample of high school seniors in the midst of the college 
planning process. In this study, when ACT Composite score was regressed solely on the available 
noncognitive characteristics, the percentage of the variance in scores accounted for by these 
predictors was 33%, a value more in line with the findings of Gaertner and McClarty (2015). The 
focus of this study, however, was to evaluate the incremental variance explained in ACT scores by 
noncognitive characteristics beyond the traditional predictors. 

Academic Factors 

The rigor or academic intensity of the high school curriculum (especially in mathematics) has 
been shown to be a key indicator for whether a student is ready for and will succeed in college 
(Achieve, 2008; Adelman, 2006). Therefore, given the content and purpose of the ACT tests, it is 
not surprising that ACT scores have a strong relationship with both the courses taken in high school 
and the grades earned in these courses. The level of the coursework taken in high school was found 
to be strongly related to performance on the ACT, even after accounting for high school grades. 
Specifically, taking higher-level mathematics and science coursework was associated with higher 
ACT scores in most subject areas (by up to 3.0 scale score points). This is not to say that other 
courses taken, including English and social studies courses, were unrelated to ACT performance. 
In general, some of these other courses had limited variability (they were either taken by most 
students or taken by few). 

This study also found that taking advanced, accelerated, honors, and/or dual credit coursework in 
specific subject areas was associated with higher ACT scores, by at most 2.9 scale score points if 
a student took advanced coursework in English, mathematics, natural science, and social studies. 
Beyond these effects, students expecting to earn seven or more college credit hours typically had 
higher ACT scores than those not expecting to earn any college credit hours while in high school (by 
0.4 to 0.6 point). 

School Characteristics 

The percentage of variance accounted for by the school-level variables was consistent with what 
has been found in previous studies. Specifically, although Gaertner and McClarty (2015) had nearly 
40 different school-level characteristics that comprised an overall school-level factor and Noble et 
al. (1999a, 1999b) included fixed effects for each school, this study found a similar percentage of 
variance accounted for by school-level characteristics using fewer school characteristics (7% to 9% 
versus 7.5% and 5% to 7%, respectively). The specific school demographic and climate/culture- 
related characteristics used in this study have also been found to be related to student achievement 
in other studies (Swanson, 2004; Kim & Sunderman, 2005; Cook & Evans, 2000; Oseguera, 2013). 
School demographic characteristics were included in this study as possible proxies for information 
related to school resources, school environment, and quality of education (Alliance for Excellent 
Education, 2013; Comer, 2005). Some studies have suggested that the demographic composition 
of schools and school working conditions have a high impact on where higher-performing teachers 
are employed, even beyond teacher salary (Baugh & Stone, 1982; Hanushek, Kain, & Rivkin, 1999; 
2004; Hanushek & Luque, 2000). 

After controlling for other student and school characteristics, the finding that public schools 
outperformed non-public schools on the ACT mathematics and science tests may initially seem 
counterintuitive. However, this finding in mathematics has been noted previously by Lubienski et al. 
(2008); the authors used the 2003 National Assessment of Educational Progress (NAEP) data set 
that included fourth and eighth graders. The explanation provided for this finding was that public 
schools tend to employ more certified mathematics teachers and utilize more reform-oriented 
mathematics teaching practices, which include using calculators, non-number emphasis, and 
discovery learning, with less emphasis on rote learning and routine procedures. 

School-level characteristics are often considered factors that students cannot change, including 
the quality of education that they receive. Because the school climate and culture can influence 
students’ aspirations, engagement, academic behaviors, and achievement (Akey, 2006; O’Brennan 
& Bradshaw, 2013), school-level characteristics were included in the models prior to noncognitive 
student characteristics. However, the overall results of the study did not significantly change when 
these blocks (4 and 5) were reversed in the models. 18 Unfortunately, school characteristics analyzed 
were necessarily limited to those available for all schools, especially in terms of those measuring 
school climate and culture. 

Noncognitive Characteristics 

In this study, we found that higher educational plans and perceptions of education were associated 
with higher ACT scores, even after accounting for academic factors and school characteristics. One 
possible explanation for these findings is that these students may be more motivated and engaged 
in their learning because of their academic goals and values. Higher levels of student engagement 
have been found to be related to higher levels of academic achievement (Heller, Calderon, & 
Medrich, 2003; Lee & Shute, 2009). 

18 If the noncognitive characteristic block was specified to enter the model prior to the school-level characteristics, then the variance 
accounted for by the noncognitive characteristics remained in a similar range (4% to 9%). 

Given that a majority of the students included in this study were from states whose ACT-tested 
population represents college-bound students, the interpretation of the finding that students who 
took the ACT earlier during their junior year tended to have higher test scores might also be related 
to motivation. Students who have a stronger desire to enroll in college may take the test earlier to 
better understand their college options, gauge their college readiness, or allow ample time to 
retest if they believe they can score better on subsequent attempts. 19 Other ACT studies have found 
retesting to be associated with similar score gains (Schiel & Valiga, 2014). 

In this study, indicating a need for help with reading and mathematics skills and the frequency with 
which students were challenged by their high school coursework were both negatively related to 
ACT scores. The results for the need for help in certain academic skills were consistent with those 
reported in the earlier studies by Noble et al. (1999a, 1999b). ACT English scores were more 
strongly related to indicating a need for help in reading than in writing. This finding could be due 
to these two variables being moderately related (r = 0.45). ACT science scores were related to 
indicating a need for help in both reading and mathematics; students were not asked about their 
need for help in science. In general, students appear to have a good idea about those areas in 
which they need additional help. Unfortunately, we do not know whether study respondents acted 
upon their indications of needing help after receiving their ACT score reports. Future studies might 
consider following such students to further examine this issue. 

Parents checking completion of homework assignments more frequently was negatively related 
to performance on the ACT. This finding seems counterintuitive because parental involvement has 
generally been found to positively influence student achievement (Comer, 2005; Henderson & Mapp, 
2002). However, in a review of the literature, Henderson and Mapp (2002) cited a couple of studies 
that found parental at-home involvement to be negatively related to test scores and grades. The 
authors indicated that the findings suggested that more at-home help was provided for struggling 
students than for those not struggling academically. In this study there was not a statistically 
significant interaction effect between HSGPA and parents checking completion of homework 
assignments on ACT scores. However, the association of this parental involvement factor with ACT 
scores was in the same direction in unadjusted and adjusted analyses. 

The number of hours spent studying per week outside of class and the academic commitment 
component were not found to be related to ACT scores. These factors were, however, related to 
HSGPA. These findings agree with another study (Allen, Robbins, Casillas, & Oh, 2008) that found 
that students’ level of academic discipline (as measured by ACT Engage®) was statistically related 
to HSGPA, but was not related to ACT scores. Motivational processes, self-efficacy, and education 
plans likely factor into coursework selection and grades earned. 20 Considering that HSGPA and 
noncognitive characteristics were shown to be fairly highly related and that HSGPA entered the 
model first, any overlap in variance accounted for in ACT scores by HSGPA and noncognitive 
characteristics would be attributed to HSGPA. 

19 At the time of data collection, eight states administered the ACT to all public high school juniors. These states accounted for 19% of 
the study sample. Within these states, 91% of respondents had taken the ACT as a junior compared to only 48% of students from all 
other states. 

20 For example, in the Gaertner and McClarty (2015) study, participation in advanced and accelerated courses loaded on both their 
achievement and motivation components. 

The modeling techniques used in this study were not able to detect mediation effects that may be 
present. This was why we also evaluated the relationships between the noncognitive characteristics 
and HSGPA. A study by Noble, Roberts, and Sawyer (2006), using structural equation modeling, 
found that ACT scores were directly influenced only by academic achievement in high school 
as measured by grades earned and coursework taken. Education-related accomplishments and 
activities and perceptions of self and others (other noncognitive characteristics) had only indirect 
effects on ACT scores through their academic achievement measure. Their findings may help explain 
why the noncognitive characteristics accounted for a small percentage of the variance in ACT scores 
beyond the traditional predictors. 

Student Demographics 

Despite ample contrary evidence (Mattern, Patterson, Shaw, Kobrin, & Barbuti, 2008; Radunzel & 
Noble, 2013; Sanchez, 2013), some researchers have suggested that standardized test scores like 
the ACT test unfairly disadvantage underrepresented minority and lower-income students in the 
college admissions process (Soares, 2012). Others contend that standardized test scores simply 
capture SES (Colvin, 1997; Guinier & Torres, 2002; Kohn, 2001). However, Sackett, Kuncel, Arneson, 
Cooper, and Waters (2009) largely dispelled this latter claim by showing that the relation between 
standardized achievement test scores and college grades— when partialing out the effect of SES— is 
nearly identical to the zero-order relation (r = 0.44 vs. 0.47), which indicates that nearly all of the 
relationship between college grades and standardized achievement test scores is independent of SES. 
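
For readers unfamiliar with partialing out a third variable, the standard first-order partial correlation formula can be sketched as follows. Only the zero-order value of 0.47 comes from the text above; the two SES correlations (R_TEST_SES and R_GRADES_SES) are hypothetical placeholders, not values reported by Sackett et al. (2009), and the function name is an assumption.

```python
import math

def partial_correlation(r_xy, r_xz, r_yz):
    """First-order partial correlation between x and y, controlling for z."""
    return (r_xy - r_xz * r_yz) / math.sqrt((1 - r_xz ** 2) * (1 - r_yz ** 2))

R_TEST_GRADES = 0.47   # zero-order relation quoted in the text
R_TEST_SES = 0.30      # hypothetical: test scores with SES
R_GRADES_SES = 0.20    # hypothetical: college grades with SES

partial = partial_correlation(R_TEST_GRADES, R_TEST_SES, R_GRADES_SES)
print(f"Zero-order: {R_TEST_GRADES:.2f}, partial: {partial:.2f}")
```

When SES is only modestly correlated with both test scores and grades, the partial correlation stays close to the zero-order correlation, which is the pattern the cited finding describes.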

Results from this study suggest that differential performance on the ACT among student 
demographic groups is largely attributable to differential academic preparation. Specifically, after 
accounting for HSGPA, high school coursework taken, school characteristics, and other noncognitive 
student characteristics, SES and other demographic characteristics (including parental education 
level, race/ethnicity, and gender) accounted for a small percentage of the variance in ACT scores 
(4% or below). Additionally, differences in ACT scores among racial/ethnic, family income, and 
parental education level groups were substantially reduced when students’ academic preparation 
and achievement levels were taken into account. 21 Other research (Sawyer, 2008) suggests that 
differential performance on test scores starts early, and that improved high school coursework 
and grades benefit students with greater prior achievement more than those with lower prior 
achievement. School-level demographic characteristics, along with other school-level characteristics, 
were included in the models to account for high school attended. In subsequent analyses, when the 
school-level demographic factors were excluded, student-level racial/ethnic and income regression 
coefficients were only slightly higher, by at most 0.4 point, than those reported from the final models. 

Even though gender did not contribute much to the percentage of variation in ACT scores beyond 
that explained by other student and school characteristics, mean gender differences in ACT 
mathematics, science, and Composite scores persisted in the final models (by 1.1, 1.2, and 0.6 
points, respectively). Future research should explore factors related to gender differences in ACT 
scores. One such area might include evaluating gender differences in students' academic 
self-concept in mathematics and science, that is, their belief that they can do well in these subject areas 
(Rosen, 2010). In this study, female students were more likely than male students to indicate that 
they need help with improving their mathematics skills (41% vs. 34%). This finding held even among 
students scoring higher on the ACT mathematics test. 22 


21 ACT score differences among these demographic groups would likely have been even smaller if standardized measures of prior 
achievement had been included in the multiple regression models. 

22 For both male and female students, the average ACT mathematics score was lower for those who indicated that they need help in 
mathematics than for those who did not indicate needing such help. 

Application for Alternative Statistical Techniques 

This study applied statistical techniques for sparsely clustered data and model selection that are 
not commonly used in educational research, but that have been shown to be effective either in 
simulation studies or in other disciplines. A DBM approach using CR-SEs was the regression 
method used to account for students being sparsely clustered within schools. Differences in 
regression coefficients and standard errors estimated from OLS versus CR-SE methods were not 
large. However, across all predictors and outcome variables, the standard errors estimated by OLS 
were about 10% smaller, on average, than those estimated from CR-SE. This finding is roughly 
in line with what would be expected from the design effect (Kish, 1965). Additionally, since many 
selected variables were highly significant and the design effect was fairly small (see footnote 23), predictors that 
were on the boundary of significance were the most affected by the clustering with about three or 
four variables per model changing significance between OLS and CR-SE (always in the direction of 
being significant with OLS but non-significant with CR-SE). 
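
As a rough illustration of the design effect mentioned above, the following sketch evaluates the Kish formula 1 + (m - 1)ρ given in the footnote, along with the corresponding inflation of standard errors (the square root of the design effect). The function names are hypothetical, and the average cluster size is an assumption: it is taken as the unweighted ratio of the 6,440 students to the 4,541 schools reported for the sample, which may differ from the value used in the study.

```python
import math


def design_effect(avg_cluster_size: float, icc: float) -> float:
    """Kish design effect: 1 + (m - 1) * rho."""
    return 1.0 + (avg_cluster_size - 1.0) * icc


def se_inflation(deff: float) -> float:
    """Factor by which standard errors grow under clustering (sqrt of the design effect)."""
    return math.sqrt(deff)


def implied_icc(deff: float, avg_cluster_size: float) -> float:
    """Invert the design-effect formula to recover the ICC it implies."""
    return (deff - 1.0) / (avg_cluster_size - 1.0)


# Assumption: unweighted average cluster size from the sample counts in the abstract.
m = 6440 / 4541

for deff in (1.08, 1.09):
    print(f"deff={deff:.2f}  SE inflation={se_inflation(deff):.3f}  "
          f"implied ICC={implied_icc(deff, m):.2f}")
```

Because the design effect is a ratio of sampling variances, the comparison with standard errors has to be made on the square-root scale, which is what `se_inflation` does.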

Stepwise selection regression was the method used to identify predictors related to performance 
on the ACT. A common argument against using stepwise selection procedures is that R² values 
and regression coefficients can be upwardly biased and lead to model overfitting (Thompson, 
1995, 2001). Modern selection methods, such as LARS or the related least absolute shrinkage 
and selection operator (LASSO), overcome these concerns, but are not able to accommodate 
order restrictions (i.e., requiring certain predictors to be in the model before other predictors are 
considered) that are needed in a blockwise analysis. As a result, these methods were not a viable 
option for the ACT score models in this study (Hesterberg, Choi, Meier, & Fraley, 2008). However, 
when comparing the results between the hybrid LARS and stepwise selection procedures for 
the HSGPA model, very few differences were found in the predictors selected, and the R² and 
regression coefficients were similar between these two methods. Moreover, with over 300 
observations per candidate predictor, overfitting concerns are likely to be small for the models in this 
study (Babyak, 2004). 

Limitations and Future Research 

The relatively high non-response rate of greater than 80% associated with the online questionnaire 
may limit the generalization of the results because those who responded may differ from those who 
did not. In fact, respondents may represent a more motivated group of students. The sample was 
weighted to account for important demographic and academic differences between respondents 
and the population. Weighting, however, does not address potential unobserved differences between 
respondents and non-respondents. When the weighted sample and the entire population were compared 
on high school coursework, grades, student demographics, and the needs-help indicators (i.e., the variables 
available for both groups), no major differences were found in the distributions of 
these variables. 

Moreover, even though a considerable percentage of the variability in ACT scores was explained, 
a fair amount remained unaccounted for by the student and school characteristics included in this 
study, from 39% (Composite) to 56% (reading). While part of the unexplained variance is due to 
measurement error in the subject tests (average standard error of measurement is about 2.0 scale 
score points; ACT, 2014b), most of the remaining unexplained variance is likely due to the limited 
number of noncognitive and school characteristics available. 24 For instance, the online questionnaire 
focused more on students’ college planning activities, intentions, and financial concerns, 
characteristics that did not seem to be relevant for performance on the ACT. Future studies should 
incorporate a more extensive list of noncognitive characteristics, including using instruments 
designed to measure specific constructs such as motivation, self-regulation, and academic 
self-efficacy. Future studies might also consider accounting for measurement error in the variables 
studied. Moreover, one could investigate relationships of postsecondary outcomes (e.g., enrollment, 
persistence, and first-year grades in college) with ACT scores and the other student and school 
characteristics included in this study. 


23 The design effect is calculated by 1 + (m - 1)ρ, where m is the average cluster size and ρ is the ICC. The design effect gives an 
estimate of the ratio of the sampling variance with clustering compared to the sampling variance assuming independence. The design 
effect for this study was 1.08 to 1.09, which means that the sampling variance is 1.08 to 1.09 times bigger than it would be if the study 
data were based on the same sample size under independence among students. 

Implications 

The implications of the study findings are similar to those outlined fifteen years ago by Noble et 
al. (1999a). First, in order for students to achieve higher ACT scores, and thus be better prepared 
academically for college, they need to focus on taking rigorous courses in high school and earning 
good grades. In particular, taking mathematics courses beyond Algebra II, science coursework 
that includes a Physics course, and/or advanced, accelerated, honors, or dual-credit coursework 
in multiple subject areas appears to benefit students. Research by others suggests that students 
also need to develop strong academic behaviors and study skills early on, even before entering high 
school, to succeed in college (Conley, 2007). 

Second, counselors and teachers can support students by encouraging them to do well in school, to 
have high aspirations and perceptions of the value of education, and to seek help when they need it. 
Students appear to have a good idea about those areas in which they need additional help, such as 
in reading and mathematics. Counselors and teachers can provide these students with the support 
and resources they need to improve their academic skills. 

Third, although not directly measured in this study, all students need to be provided with a 
challenging, quality education, including equal access to rigorous high school coursework and the 
opportunity to earn college credits while in high school (Clinedinst, 2015; Gagnon & Mattingly, 2015; 
Handwerk, Tognatta, Conley, & Gitomer, 2008). This responsibility falls to administrators, teachers, 
and counselors, as well as the parents and communities that support the school system. Research 
by others suggests that positive school climates featuring high-quality academic instruction and high 
levels of academic expectations, student engagement, and parental involvement can contribute to 
improved student achievement and increased college aspirations and access (Alliance for Excellent 
Education, 2013; Heller et al., 2003; O’Brennan & Bradshaw, 2013; Oseguera, 2013). 


24 The amount of explained variance is capped by the reliability of the individual subject scales (typical estimates range from 0.83 in 
science to 0.92 in English; ACT, 2014b). 

Appendix A 


Table A-1. Items that loaded with magnitude above 0.40 on components 

Component                Item                                                              Loading
-----------------------  ----------------------------------------------------------------  -------
Academic Commitment                                                                         0.73¹
                         I turn in assignments on time.                                     0.81
                         I am well-prepared for in-class activities (e.g., discussion,
                         pop quizzes, etc.).                                                0.81
                         I put forth my best effort in my schoolwork.                       0.81
                         It will be difficult to discipline myself to keep my academic
                         commitments in college, such as attending classes and being
                         prepared for them.                                                -0.42

Perception of Education                                                                     0.44¹
                         It is important to take mathematics and/or science classes
                         during my senior year of high school.                              0.67
                         The benefits of earning a college degree are well worth the
                         costs.                                                             0.71
                         Regardless of the obstacles or hardships I face in college, I am
                         committed to completing a college degree.                          0.62

¹ The loading column for the component row is the internal consistency reliability (as measured by Cronbach's α) of the items loading 
with magnitude above 0.40 on the component. 




■ ACT Research Report A Multidimensional Perspective of College Readiness 


Appendix B 

Additional Detail on Model-Based and Design-Based Methods to Account for Clustering 

Model-Based Methods 

Model-based methods account for clustering by specifically incorporating the clustered data 
structure in the model through random effects and/or a particular within-cluster residual covariance 
matrix. The predominant model-based method is the multilevel model (MLM; also called hierarchical 
linear model, random coefficient model, or mixed model). Using Laird and Ware (1982) notation, 
MLMs for continuous outcomes can be written as $Y_j = X_j\beta + Z_j b_j + \varepsilon_j$, where $Y_j$ is a vector 
of responses for cluster $j$, $X_j$ is a design matrix for the fixed effects of cluster $j$, $\beta$ is a vector of 
fixed effects, $Z_j$ is a design matrix for the random effects of cluster $j$, $b_j$ is a vector of random 
effects for cluster $j$ where $E(b_j) = 0$ and $\mathrm{Cov}(b_j) = G$, and $\varepsilon_j$ is a vector of residuals of the 
observations in cluster $j$ where $E(\varepsilon_j) = 0$ and $\mathrm{Cov}(\varepsilon_j) = R_j$. MLMs produce cluster-specific 
inferences such that the mean of the outcome is conditional on both the values of the predictor 
variables and the values of the random effects (i.e., $E[Y_j \mid X_j, b_j]$ for $Y_j$ a vector of outcomes for 
cluster $j$, $X_j$ a matrix of predictors for cluster $j$, and $b_j$ a vector of random effects). 

MLMs are implemented when there is an explicit interest in modeling both the mean and the 
covariance of the outcome such that $E[Y_j \mid X_j, b_j] = X_j\beta$ and $\mathrm{Var}(Y_j) = Z_j G Z_j^T + R_j$. 
Because $\mathrm{Var}(Y_j)$ is of explicit interest, the covariance structures for $G$ and $R_j$ must be properly 
specified. Otherwise, standard error estimates and point estimates may be estimated with bias. 
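
To make the covariance expression concrete, the sketch below builds the marginal covariance of one cluster's outcomes, Z G Z^T + R, for the simplest case of a random-intercept model. The random-intercept structure, the variance values, and the function name are assumptions chosen for illustration; they are not estimates or specifications from the study.

```python
import numpy as np


def marginal_covariance(Z: np.ndarray, G: np.ndarray, R: np.ndarray) -> np.ndarray:
    """Marginal covariance of one cluster's outcomes: Z G Z^T + R."""
    return Z @ G @ Z.T + R


# Hypothetical values for illustration only (not estimates from the study).
n_j = 3            # students in cluster j
tau2 = 4.0         # random-intercept variance (the single element of G)
sigma2 = 16.0      # residual variance

Z_j = np.ones((n_j, 1))          # random-intercept design matrix
G = np.array([[tau2]])
R_j = sigma2 * np.eye(n_j)       # independent, homoskedastic residuals

V_j = marginal_covariance(Z_j, G, R_j)
icc = tau2 / (tau2 + sigma2)     # share of total variance that lies between clusters

print(V_j)
print("ICC:", icc)
```

In this special case every pair of students in the same cluster shares the covariance `tau2`, so a misspecified G or R changes the implied within-cluster dependence directly.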

Design-Based Methods 

Design-based methods (DBMs) account for the clustering through statistical corrections to address 
violations of single-level model assumptions rather than through incorporating aspects of the 
clustering directly into the model, which is often accomplished through random effects. The two 
common DBMs are (1) the semi-parametric generalized estimating equations (GEEs; Liang & Zeger, 
1986) which are estimated with quasi-likelihood and (2) the parametric cluster-robust estimators 
(CR-SEs, also called sandwich or empirical estimators) which are typically estimated with ordinary 
least squares (OLS) or likelihood methods. 

Standard OLS regression is formulated by $Y = X\beta + \varepsilon$ for $Y$, an $n \times 1$ vector of outcomes, $X$, 
an $n \times p$ design matrix, and $\varepsilon$, an $n \times 1$ vector of residuals assumed to be distributed 
$(0, \sigma^2)$. Under OLS, the regression coefficients have a closed form solution such that 
$\hat{\beta} = (X^T X)^{-1} X^T Y$. The standard errors of the regression coefficients are taken from the square 
root of the diagonal elements of $\mathrm{Var}(\hat{\beta})$, which is most generally calculated by 
$\mathrm{Var}(\hat{\beta}) = (X^T X)^{-1} X^T \varepsilon\varepsilon^T X (X^T X)^{-1}$. Assuming independently and identically distributed residuals, 
$\varepsilon\varepsilon^T$ can be summarized by the average squared residuals $\sigma^2 = (n - p)^{-1}\varepsilon\varepsilon^T$, which results in a 
block diagonal matrix $\sigma^2 I$ where $\sigma^2$ is the residual variance. The estimate of $\mathrm{Var}(\hat{\beta})$ then simplifies 
to $\sigma^2 (X^T X)^{-1} X^T X (X^T X)^{-1}$, and given that $(X^T X)^{-1} X^T X = I$, $\mathrm{Var}(\hat{\beta}) = \sigma^2 (X^T X)^{-1}$ if residuals 
are independently and identically distributed. However, when the independence assumption is 
violated, summarizing $\varepsilon\varepsilon^T$ with $(n - p)^{-1}\varepsilon\varepsilon^T$ is not appropriate and results in estimates of $\mathrm{Var}(\hat{\beta})$ 
being too small, meaning that the standard errors are underestimated and, consequently, type I 
errors are inflated (Cameron & Miller, 2013). 

When data are dependent through clustering, the residuals of observations within clusters are 
likely related, meaning that the assumption of independently and identically distributed residuals 
is unlikely to be upheld. Robust standard errors (perhaps more appropriately called empirical or 
heteroskedasticity-corrected covariance estimators) address this problem by replacing the 
average of squared residuals $(n - p)^{-1}\varepsilon\varepsilon^T$ with the squared residual $\varepsilon\varepsilon^T$, which does not require 
diagonal elements to be identical. After this substitution, $\varepsilon\varepsilon^T$ can no longer be summarized by 
$\sigma^2 I$ and thus the variance of the regression coefficients is equal to its original formulation as 
$\mathrm{Var}(\hat{\beta}) = (X^T X)^{-1} X^T \varepsilon\varepsilon^T X (X^T X)^{-1}$ (White, 1980). 
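
The classical and heteroskedasticity-robust variance formulas can be compared with a short numerical sketch. This is a minimal illustration on simulated data, not the study's analysis: the function names and the data-generating values are hypothetical, and the robust version uses only the squared residuals on the diagonal of the residual cross-product, as in White's (1980) estimator, with no small-sample correction.

```python
import numpy as np


def ols_fit(X: np.ndarray, y: np.ndarray):
    """Return OLS coefficients, residuals, and (X'X)^-1 from the closed-form solution."""
    XtX_inv = np.linalg.inv(X.T @ X)
    beta = XtX_inv @ X.T @ y
    resid = y - X @ beta
    return beta, resid, XtX_inv


def classical_se(X, resid, XtX_inv):
    """Standard errors under i.i.d. residuals: sqrt(diag(sigma^2 (X'X)^-1))."""
    n, p = X.shape
    sigma2 = resid @ resid / (n - p)
    return np.sqrt(np.diag(sigma2 * XtX_inv))


def white_se(X, resid, XtX_inv):
    """Heteroskedasticity-robust standard errors (no small-sample correction)."""
    meat = X.T @ (X * resid[:, None] ** 2)   # X' diag(e_i^2) X
    return np.sqrt(np.diag(XtX_inv @ meat @ XtX_inv))


# Simulated data (hypothetical; not the study data).
rng = np.random.default_rng(0)
n = 500
x = rng.normal(size=n)
X = np.column_stack([np.ones(n), x])
y = 1.0 + 2.0 * x + rng.normal(size=n) * (1.0 + np.abs(x))  # variance grows with |x|

beta, resid, XtX_inv = ols_fit(X, y)
print("beta:", beta)
print("classical SE:", classical_se(X, resid, XtX_inv))
print("White SE:    ", white_se(X, resid, XtX_inv))
```

Both estimators share the same coefficients; only the middle term of the sandwich differs.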

To address violations of the residuals being independently distributed, $X^T \varepsilon\varepsilon^T X$ is calculated for 
each cluster individually and then summed across all $J$ clusters, $\sum_{j=1}^{J} X_j^T \varepsilon_j \varepsilon_j^T X_j$. This quantity is then 
pre- and post-multiplied by $(X^T X)^{-1}$ to obtain the standard errors that account for clustering (i.e., 
cluster robust standard errors) such that $\mathrm{Var}_{CR}(\hat{\beta}) = (X^T X)^{-1} \sum_{j=1}^{J} (X_j^T \varepsilon_j \varepsilon_j^T X_j) (X^T X)^{-1}$ (White, 
1984). For non-linear models and/or models estimated with maximum likelihood, the same principles 
can be applied except that the residuals will be calculated differently and estimates will not have 
closed form solutions. 
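
The cluster-robust calculation for the linear case can be sketched in a few lines. This is an illustrative implementation on simulated data under stated assumptions: the function name, cluster sizes, and data-generating values are hypothetical, and no small-sample correction is applied, so results may differ slightly from those of statistical packages that apply one.

```python
import numpy as np


def cluster_robust_se(X: np.ndarray, y: np.ndarray, cluster_ids: np.ndarray):
    """OLS coefficients with cluster-robust SEs (no small-sample correction)."""
    XtX_inv = np.linalg.inv(X.T @ X)
    beta = XtX_inv @ X.T @ y
    resid = y - X @ beta

    p = X.shape[1]
    meat = np.zeros((p, p))
    for c in np.unique(cluster_ids):
        idx = cluster_ids == c
        score = X[idx].T @ resid[idx]        # X_j' e_j, a p-vector
        meat += np.outer(score, score)       # X_j' e_j e_j' X_j

    cov = XtX_inv @ meat @ XtX_inv
    return beta, np.sqrt(np.diag(cov))


# Simulated sparsely clustered data (hypothetical; not the study data).
rng = np.random.default_rng(1)
n_clusters, per_cluster = 200, 2
cluster_ids = np.repeat(np.arange(n_clusters), per_cluster)
u = rng.normal(scale=1.0, size=n_clusters)[cluster_ids]   # shared cluster effect
x = rng.normal(size=cluster_ids.size)
X = np.column_stack([np.ones_like(x), x])
y = 0.5 + 1.5 * x + u + rng.normal(size=x.size)

beta, se_cr = cluster_robust_se(X, y, cluster_ids)
print("beta:", beta)
print("cluster-robust SE:", se_cr)
```

When every observation is its own cluster, the sum reduces to the heteroskedasticity-robust estimator, which is a convenient sanity check on an implementation.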

The square root of the diagonal elements of $\mathrm{Var}_{CR}(\hat{\beta})$ provides standard errors for the regression 
coefficients that account for the dependency in the data due to the clustering of observations. Note 
that the clustering was accounted for without any of the random effects that are required in MLMs; the 
residuals are used to assess the degree of dependence and then to correct $\mathrm{Var}(\hat{\beta})$ accordingly. As 
a result, CR-SEs (as well as GEEs) are population-averaged models, meaning that estimates apply 
to the average over all clusters rather than being cluster-specific as with MLMs (i.e., $E[Y_j \mid X_j]$ 
in DBMs; Cameron & Miller, 2013). This interpretation is identical to that of a single-level model, which 
makes sense since a CR-SE model is essentially a single-level model with a statistical correction 
for clustered observations. That is, DBMs are useful when the regression coefficients are of interest 
and neither partitioning the variance within and between levels nor making inferences for specific 
clusters is relevant. Although DBMs and MLMs estimate different quantities, cluster-specific and 
population-averaged estimates can be shown to have equivalent interpretations with continuous (but not 
discrete) outcomes (Fitzmaurice, Laird, & Ware, 2012). 

More conceptually, in DBMs, the clustering is considered a nuisance that has to be dealt with and 
prevents one from fitting a single-level model. In contrast, clustering in MLMs is a substantively 
interesting part of the research question to be explicitly modeled and tested to determine how the 
variance is partitioned between and within cluster levels. In DBMs, neither the random effects nor 
their covariance structure needs to be specified. In place of random effects, clustering is 
accounted for indirectly through its effect on the residuals, which are affected 
when assumptions are violated. DBMs use the information from the residuals to statistically correct 
standard errors to account for the clustering of the data. The variance is not partitioned between 
levels, no random effects are estimated, and no variance components are output. The resulting 
output resembles a standard single-level regression (e.g., OLS for metric outcomes) except the 
standard errors have accounted for the clustered nature of the data. 

Appendix C 

More Detail on Least Angle Regression (LARS) 

LARS is a forward selection algorithm that encompasses a variety of regularization methods as 
special cases or extensions. Regularization methods are used to account for overfitting by applying 
a penalty term in the estimation process (e.g., the loss function in OLS; see McNeish, 2015, for an 
overview of regularization as relevant to behavioral sciences). LARS uses the correlation between 
candidate predictors and the residual to select relevant predictors as opposed to p-values or R² that 
are commonly used in stepwise procedures. LARS can select and estimate coefficients by itself 
or can also be implemented as a more general, computationally efficient method to obtain least 
absolute shrinkage and selection operator (LASSO) or ridge regression estimates. As a result of 
LARS, regularization methods can be estimated in the same amount of time as an equivalent OLS 
model (Efron, Hastie, Johnstone, & Tibshirani, 2004). 

LARS and related methods improve upon traditional automatic selection methods that use p-values 
or R² by addressing concerns such as overfitting, inflated R² values, and over-selection of variables 
due to inflated type I errors resulting from repeated testing (Flom & Cassell, 2007). These concerns 
are more prevalent when the ratio of sample size to candidate predictors is small, which is not the 
case in the current study (Babyak, 2004; Hesterberg, Choi, Meier, & Fraley, 2008). 

Although LARS can output regression coefficients through LASSO (also known as $\ell_1$ penalization) 
or its own estimation process, Efron et al. (2004) also outlined a hybrid method whereby the LARS 
algorithm selects relevant variables but least squares or another estimation method is used to 
estimate regression coefficients. Belloni and Chernozhukov (2013), Belloni, Chen, Chernozhukov, 
and Hansen (2012), and Belloni, Chernozhukov, and Hansen (2014) also discuss using LARS or 
related regularization methods to select predictors and then an alternative estimation method for 
the regression coefficients. The HSGPA model employed this hybrid method by obtaining estimates 
through CR-SEs once the LARS algorithm identified the relevant predictors. 
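
The two-stage logic of the hybrid method (select first, then estimate without a penalty) can be sketched as follows. This is not the LARS algorithm itself: as a loudly labeled simplification, the selection stage below is a greedy forward selection that, like LARS, picks predictors by their correlation with the current residual. The second stage uses plain least squares rather than the CR-SE estimation used for the HSGPA model. All function names and the simulated data are hypothetical.

```python
import numpy as np


def greedy_correlation_selection(X: np.ndarray, y: np.ndarray, k: int) -> list:
    """Pick k predictors, each time taking the one most correlated with the current residual.

    Simplified stand-in for LARS (an assumption of this sketch): after each pick the
    residual is recomputed from a full least-squares fit on the selected columns.
    """
    Xc = (X - X.mean(axis=0)) / X.std(axis=0)   # standardize predictors
    yc = y - y.mean()
    resid = yc.copy()
    active = []
    for _ in range(k):
        corr = np.abs(Xc.T @ resid)
        if active:
            corr[active] = -np.inf               # do not reselect a chosen column
        active.append(int(np.argmax(corr)))
        coef, *_ = np.linalg.lstsq(Xc[:, active], yc, rcond=None)
        resid = yc - Xc[:, active] @ coef
    return active


def refit_ols(X: np.ndarray, y: np.ndarray, active: list) -> np.ndarray:
    """Second stage of the hybrid: unpenalized least squares on the selected predictors."""
    Xa = np.column_stack([np.ones(len(y)), X[:, active]])
    coef, *_ = np.linalg.lstsq(Xa, y, rcond=None)
    return coef


# Simulated data (hypothetical): only columns 0 and 3 truly matter.
rng = np.random.default_rng(2)
X = rng.normal(size=(400, 8))
y = 3.0 * X[:, 0] - 2.0 * X[:, 3] + rng.normal(size=400)

active = greedy_correlation_selection(X, y, k=2)
print("selected columns:", sorted(active))
print("refit coefficients (intercept first):", refit_ols(X, y, active))
```

Separating the two stages means the final coefficients are not shrunk by a penalty; the selection step only determines which predictors enter the model.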

Standard LARS operates on the full design matrix of predictors, so it was not a candidate selection 
procedure for the blockwise ACT score models because variables that were significant in earlier 
blocks would not be assured to be kept in the model when subsequent blocks of predictors were 
added (Hesterberg et al., 2008). There are grouped versions of LASSO, but these methods define 
a group as a multi-parameter factor rather than as a block of predictors. Grouped LARS would 
not be appropriate for blockwise regression because the regularization is applied to the group as a 
whole, which is not desirable when earlier blocks must be retained. Additionally, grouped 
methods are not currently available in SAS, the statistical software package that was used in this 
analysis. 

LARS selection can be implemented in SAS through PROC GLMSELECT using the SELECTION=LAR 
option in the Model statement. Due to the quickly expanding literature on these alternative methods 
and the open-source platform of the software program R, R has more advanced capabilities with 
LARS and related methods (LASSO in particular) than SAS does. 







ACT 


ACT is an independent, nonprofit organization that provides assessment, research, 
information, and program management services in the broad areas of education 
and workforce development. Each year, we serve millions of people in high 
schools, colleges, professional associations, businesses, and government agencies, 
nationally and internationally. Though designed to meet a wide array of needs, all 
ACT programs and services have one guiding purpose— helping people achieve 
education and workplace success. 

For more information, visit www.act.org. 


